Visual Recognition And Search

EECS 6890 – Topics in Information Processing

Spring 2013, Columbia University

http://rogerioferis.com/VisualRecognitionAndSearch

Class 3: Feature Coding and Pooling

Liangliang Cao, Feb 7, 2013



Problem

The Myth of Human Vision

Can you understand the following?

婴儿, bebé, 아가, बच्चा (each means "baby")

People may have difficulty understanding text in different languages, but NOT images.

Photo courtesy of luster


Problem

Can a computer vision system recognize objects or scenes as humans do?


Why Is This Problem Important?

Search engines, traditional companies, mobile apps


Problem

http://www.vision.caltech.edu

Examples of Object Recognition Datasets


http://groups.csail.mit.edu/vision/SUN/

Problem


Overview of Classification Model

Coding: to map local features into a compact representation

Pooling: to aggregate these compact representations together

• Histogram of SIFT: coding by vector quantization; pooling by histogram aggregation

• Uncertainty-Based Quantization: coding by soft quantization; pooling by histogram aggregation

• Sparse Coding: coding by soft quantization; pooling by max pooling

• Fisher vector / Supervector: coding by GMM probability estimation; pooling by GMM adaptation
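The split between coding and pooling can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the descriptors and codebook are random toy data, and the function names `hard_code` and `pool` are hypothetical, not from the slides.

```python
import numpy as np

def hard_code(features, codebook):
    """Coding: map each local feature to a one-hot vector over codewords."""
    # Squared Euclidean distance from every feature to every codeword.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    codes = np.zeros_like(d2)
    codes[np.arange(len(features)), d2.argmin(axis=1)] = 1.0
    return codes

def pool(codes, how="sum"):
    """Pooling: aggregate per-feature codes into one fixed-size vector."""
    return codes.sum(axis=0) if how == "sum" else codes.max(axis=0)

rng = np.random.default_rng(0)
features = rng.normal(size=(50, 8))   # 50 toy local descriptors, 8-D (assumed)
codebook = rng.normal(size=(5, 8))    # 5 toy codewords (assumed)

codes = hard_code(features, codebook)   # shape (50, 5), one row per feature
image_vector = pool(codes, "sum")       # shape (5,), a histogram of codewords
```

Sum pooling of one-hot codes gives the bag-of-words histogram; swapping in a different coding function or `how="max"` changes the method without changing the overall pipeline.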


Outline

• Histogram of local features

• Bag of words model

• Soft quantization and sparse coding

• Fisher vector and supervector



Histogram of Local Features


Bag of Words Models

• Powerful local features

– DoG

– Hessian, Harris

– Dense-sampling

Recall of Last Class

The number of local regions varies from image to image!


Bag of Words Models

• Histograms can provide a fixed-size representation of images

• Spatial pyramid/gridding can enhance the histogram representation with spatial information

Recall of Last Class (2)


Bag of Words Models

Histogram of Local Features

[Figure: histogram of codeword frequencies; x-axis: codewords, y-axis: frequency]

dim = #codewords


Bag of Words Models

Histogram of Local Features (2)

dim = #codewords × #grids
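A minimal sketch of a gridded histogram, showing why the dimension becomes #codewords × #grids. It assumes features have already been assigned to codewords and that positions are normalized to [0, 1); the function name `grid_histogram` and the 2 × 2 grid are illustrative choices, not taken from the slides.

```python
import numpy as np

def grid_histogram(words, xy, n_codewords, grid=2):
    """Concatenate one codeword histogram per cell of a grid x grid layout.

    words: (N,) codeword index of each local feature
    xy:    (N, 2) feature positions, normalized to [0, 1)
    """
    # Column/row index of the cell each feature falls into.
    cell = np.minimum((xy * grid).astype(int), grid - 1)
    cell_id = cell[:, 1] * grid + cell[:, 0]      # 0 .. grid*grid - 1
    flat = cell_id * n_codewords + words          # joint (cell, codeword) index
    return np.bincount(flat, minlength=grid * grid * n_codewords).astype(float)

rng = np.random.default_rng(0)
words = rng.integers(0, 10, size=200)   # toy codeword assignments (assumed)
xy = rng.random(size=(200, 2))          # toy feature positions (assumed)

h = grid_histogram(words, xy, n_codewords=10, grid=2)   # length 10 * 4 = 40
```

A spatial pyramid would concatenate such vectors for several grid resolutions, typically with a weight per level.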


Local Feature Quantization

Bag of Words Models

Slide courtesy of Fei-Fei Li



Local Feature Quantization

Bag of Words Models

- Vector quantization

- Dictionary learning
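Dictionary learning for vector quantization is commonly done with k-means, which also appears in the pipeline figure later in this deck. Below is a minimal sketch of Lloyd's algorithm, assuming toy 2-D data; the function name `kmeans`, the iteration count, and the seed are illustrative assumptions, and a real system would more likely call an existing library.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's algorithm: returns (codebook, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize the codewords with k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)                 # assignment step
        for j in range(k):                         # update step
            members = X[labels == j]
            if len(members) > 0:                   # keep old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment against the updated centers.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)

rng = np.random.default_rng(1)
# Two well-separated toy clusters standing in for local descriptors.
X = np.vstack([rng.normal(0.0, 0.5, (100, 2)), rng.normal(10.0, 0.5, (100, 2))])
codebook, labels = kmeans(X, k=2)
```

The rows of `codebook` play the role of codewords, and `labels` is the vector quantization of each descriptor.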


Dictionary for Codewords

Histogram of Local Features

Picture courtesy of Fei-Fei Li


Bag of Words Models

Most slides in this section are courtesy of Fei-Fei Li


Object

Bag of ‘words’


Bag of Words Models

Of all the sensory impressions proceeding to the brain, the visual experiences are the dominant ones. Our perception of the world around us is based essentially on the messages that reach the brain from our eyes. For a long time it was thought that the retinal image was transmitted point by point to visual centers in the brain; the cerebral cortex was a movie screen, so to speak, upon which the image in the eye was projected. Through the discoveries of Hubel and Wiesel we now know that behind the origin of the visual perception in the brain there is a considerably more complicated course of events. By following the visual impulses along their path to the various cell layers of the optical cortex, Hubel and Wiesel have been able to demonstrate that the message about the image falling on the retina undergoes a step-wise analysis in a system of nerve cells stored in columns. In this system each cell has its specific function and is responsible for a specific detail in the pattern of the retinal image.

sensory, brain, visual, perception, retinal, cerebral cortex, eye, cell, optical nerve, image, Hubel, Wiesel

China is forecasting a trade surplus of $90bn (£51bn) to $100bn this year, a threefold increase on 2004's $32bn. The Commerce Ministry said the surplus would be created by a predicted 30% jump in exports to $750bn, compared with a 18% rise in imports to $660bn. The figures are likely to further annoy the US, which has long argued that China's exports are unfairly helped by a deliberately undervalued yuan. Beijing agrees the surplus is too high, but says the yuan is only one factor. Bank of China governor Zhou Xiaochuan said the country also needed to do more to boost domestic demand so more goods stayed within the country. China increased the value of the yuan against the dollar by 2.1% in July and permitted it to trade within a narrow band, but the US wants the yuan to be allowed to trade freely. However, Beijing has made it clear that it will take its time and tread carefully before allowing the yuan to rise further in value.

China, trade, surplus, commerce, exports, imports, US, yuan, bank, domestic, foreign, increase, trade, value

Underlying Assumptions - Text


Bag of Words Models

Underlying Assumptions - Image


[Figure: bag-of-words pipeline with a learning branch and a recognition branch. Labels: feature detection & representation; K-means; image representation; category models (and/or) classifiers; category decision]


Bag of Words Models

Borrowing Techniques from Text Classification

• pLSA

• Naïve Bayes model

Notation

• w_n: each patch in an image

– w_n = [0, 0, …, 1, …, 0, 0]^T

• w: a collection of all N patches in an image

– w = [w_1, w_2, …, w_N]

• d_j: the j-th image in an image collection

• c: category of the image

• z: theme or topic of the patch


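Bag of Words Models

Probabilistic Latent Semantic Analysis (pLSA), Hofmann, 2001

[Figure: plate diagram with nodes d, z, w and plates N, D]

Latent Dirichlet Allocation (LDA), Blei et al., 2001

[Figure: plate diagram with nodes c, π, z, w and plates N, D]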


[Figure: pLSA plate diagram with nodes d, z, w and plates N, D]

Bag of Words Models

Probabilistic Latent Semantic Analysis (pLSA)

“face”

Sivic et al. ICCV 2005


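Bag of Words Models

[Figure: pLSA plate diagram with nodes d, z, w and plates N, D]

$$p(w_i \mid d_j) = \sum_{k=1}^{K} p(w_i \mid z_k)\, p(z_k \mid d_j)$$

• $p(w_i \mid d_j)$: observed codeword distributions

• $p(w_i \mid z_k)$: codeword distributions per theme (topic)

• $p(z_k \mid d_j)$: theme distributions per image

Slide credit: Josef Sivic

Parameters are estimated by EM or Gibbs sampling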
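The EM option for pLSA can be sketched as follows. This is a minimal illustration, not the implementation used in the cited work: the toy count matrix, the function name `plsa`, the iteration count, and the small constant added to denominators are all assumptions.

```python
import numpy as np

def plsa(counts, K, n_iter=50, seed=0):
    """EM for pLSA. counts: (D, W) codeword counts n(d_j, w_i)."""
    rng = np.random.default_rng(seed)
    D, W = counts.shape
    p_w_z = rng.random((K, W))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)      # p(w | z), rows sum to 1
    p_z_d = rng.random((D, K))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)      # p(z | d), rows sum to 1
    for _ in range(n_iter):
        # E-step: p(z | d, w) proportional to p(w | z) p(z | d); shape (D, K, W)
        post = p_z_d[:, :, None] * p_w_z[None, :, :]
        post /= post.sum(axis=1, keepdims=True) + 1e-12
        # M-step: re-estimate both tables from expected counts n(d, w) p(z | d, w)
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + 1e-12
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + 1e-12
    return p_w_z, p_z_d

# Toy corpus: 4 images, 4 codewords (assumed data).
counts = np.array([[5, 4, 0, 0],
                   [4, 6, 0, 0],
                   [0, 0, 5, 5],
                   [0, 1, 4, 6]], dtype=float)
p_w_z, p_z_d = plsa(counts, K=2)
p_w_d = p_z_d @ p_w_z               # model estimate of p(w_i | d_j)
z_star = p_z_d.argmax(axis=1)       # most probable topic per image
```

The last line corresponds to picking the topic with the highest p(z | d) for each image.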


$$z^{*} = \arg\max_{z}\, p(z \mid d)$$

Slide credit: Josef Sivic

Bag of Words Models

Recognition using pLSA


[Figure: LDA plate diagram with nodes c, π, z, w and plates N, D]

Latent Dirichlet Allocation (LDA)

Fei-Fei et al. ICCV 2005

“beach”

Bag of Words Models

Scene Recognition using LDA


Bag of Words Models

Spatially Coherent Latent Topic Model

Cao and Fei-Fei, ICCV 2007


Bag of Words Models

Simultaneous Segmentation and Recognition


Bag of Words Models

Pros and Cons

Bag of Words Models are good at

- Modeling prior knowledge

- Providing intuitive interpretation

But these models suffer from

- Loss of information in quantization of “visual words”

- Loss of spatial information

Images differ from texts!

Better coding

Better pooling


Soft Quantization and Sparse Coding


Soft Quantization

Hard Quantization


Soft Quantization

Model the uncertainty across multiple codewords

Uncertainty-Based Quantization

Gemert et al, Visual Word Ambiguity, PAMI 2009
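One common way to model uncertainty across codewords is to weight each codeword by a Gaussian kernel on its distance to the feature and normalize the weights per feature. The sketch below follows that idea; the kernel width `sigma`, the toy data, and the function name `soft_assign` are assumptions, and the details may differ from the cited paper.

```python
import numpy as np

def soft_assign(features, codebook, sigma=1.0):
    """Soft coding: each feature spreads unit mass over all codewords."""
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    # Subtract the row minimum before exponentiating, for numerical stability;
    # the shift cancels in the normalization below.
    k = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
    return k / k.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
features = rng.normal(size=(50, 8))   # toy descriptors (assumed)
codebook = rng.normal(size=(5, 8))    # toy codewords (assumed)

soft = soft_assign(features, codebook, sigma=1.0)   # (50, 5), rows sum to 1
hist = soft.mean(axis=0)                            # histogram aggregation
```

As `sigma` shrinks toward zero the weights concentrate on the nearest codeword, which recovers hard quantization.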


Soft Quantization

Intuition of UNC

Hard quantization

Soft quantization based on uncertainty

Gemert et al, Visual Word Ambiguity, PAMI 2009


Soft Quantization

Improvement of UNC


Soft Quantization

Sparse Coding-Based Quantization

Hard quantization can be viewed as an “extremely sparse representation”

A more general but hard-to-solve representation

In practice we consider sparse coding

[Equations for the three formulations, including an "s.t." constraint, are not preserved in the slide text]
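As an illustration of sparse coding as a coding step, the sketch below solves the common L1-regularized form, minimizing 0.5 · ||x − B u||² + λ ||u||₁ over the code u, with the iterative shrinkage-thresholding algorithm (ISTA). The objective is the standard lasso formulation and is an assumption here, as are the dictionary `B`, the value of `lam`, and the function name `ista`.

```python
import numpy as np

def ista(x, B, lam=0.1, n_iter=200):
    """Minimize 0.5 * ||x - B @ u||^2 + lam * ||u||_1 over u with ISTA."""
    L = np.linalg.norm(B, 2) ** 2            # Lipschitz constant of the gradient
    u = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ u - x)             # gradient of the quadratic term
        v = u - grad / L                     # gradient step
        u = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)   # soft threshold
    return u

rng = np.random.default_rng(0)
B = rng.normal(size=(16, 32))                # toy overcomplete dictionary (assumed)
B /= np.linalg.norm(B, axis=0)               # unit-norm codewords as columns
x = 2.0 * B[:, 3]                            # a feature built from codeword 3

u = ista(x, B, lam=0.1)                      # sparse code, dominated by entry 3
```

Given codes for all features of an image, max pooling would take the element-wise maximum of the absolute codes.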



Soft Quantization

Yang et al. obtain good recognition accuracy by combining sparse coding with spatial pyramid and dictionary training.

Yang et al., Linear Spatial Pyramid Matching using Sparse Coding for Image Classification, CVPR 2009

More details will be given in the group presentation.


Fisher Vector and Supervector

One of the most powerful image/video classification techniques

Thanks to Zhen Li and Qiang Chen for constructive suggestions on this section


Fisher Vector and Supervector

Winning Systems

Classification task: 2009, 2010, 2011, 2012

Large Scale Visual Recognition Challenge: 2010, 2011


Fisher Vector and Supervector

Literature

Supervector

[1] Yan et al., Regression from patch-kernel. CVPR 2008

[2] Zhou et al., SIFT-Bag kernel for video event analysis. ACM Multimedia 2008

[3] Zhou et al., Hierarchical Gaussianization for image classification. ICCV 2009: 1971-1977

[4] Zhou et al., Image classification using super-vector coding of local image descriptors. ECCV 2010

Fisher Vector

[5] Perronnin et al., Improving the Fisher kernel for large-scale image classification. ECCV 2010

[6] Jégou et al., Aggregating local image descriptors into compact codes. PAMI 2011

These papers are not very easy to read.

Let me take a simplified perspective via the coding & pooling framework.


Fisher Vector and Supervector

• Coding with hard assignment

• Coding with soft assignment

• How to keep all the information?

Coding without Information Loss



Fisher Vector and Supervector

An Intuitive Illustration

Coding


Fisher Vector and Supervector

An Intuitive Illustration

Coding: Component 1, Component 2, Component 3


Fisher Vector and Supervector

An Intuitive Illustration

Pooling: Component 1, Component 2, Component 3

[Figure: “+” signs show the coded features being added together within each component]


Fisher Vector and Supervector

Implementation of Supervector

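In speech (speaker identification), a supervector refers to the stacked means of adapted GMMs.

Supervector = $[\mu_1^\top, \mu_2^\top, \ldots, \mu_K^\top]^\top$, where $\mu_k$ is the adapted mean of the $k$-th of $K$ Gaussian components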
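A minimal sketch of building a supervector by adapting the means of a background GMM to one image's features and stacking them. It assumes diagonal covariances and a relevance-factor style of MAP mean adaptation as used in the speaker-verification literature; the value `r=16.0`, the toy GMM, and the function names are illustrative assumptions.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """gamma[n, k] = p(component k | x_n) for a diagonal-covariance GMM."""
    diff = X[:, None, :] - means[None, :, :]                       # (N, K, d)
    log_gauss = -0.5 * (np.log(2 * np.pi * variances).sum(axis=1)[None, :]
                        + (diff ** 2 / variances[None, :, :]).sum(axis=2))
    logp = np.log(weights)[None, :] + log_gauss
    logp -= logp.max(axis=1, keepdims=True)                        # stabilize exp
    p = np.exp(logp)
    return p / p.sum(axis=1, keepdims=True)

def supervector(X, weights, means, variances, r=16.0):
    """Stack the MAP-adapted component means into one long vector."""
    gamma = posteriors(X, weights, means, variances)
    n_k = gamma.sum(axis=0)                                        # soft counts
    x_bar = (gamma.T @ X) / np.maximum(n_k, 1e-10)[:, None]        # data mean per component
    alpha = (n_k / (n_k + r))[:, None]                             # adaptation strength
    adapted = alpha * x_bar + (1.0 - alpha) * means
    return adapted.ravel()

# Toy background GMM with two components in 2-D (assumed values).
weights = np.array([0.5, 0.5])
means = np.array([[0.0, 0.0], [5.0, 5.0]])
variances = np.ones((2, 2))

rng = np.random.default_rng(0)
X = rng.normal(loc=[1.0, 1.0], scale=0.5, size=(200, 2))   # one image's features
sv = supervector(X, weights, means, variances)              # length K * d = 4
```

Components that receive many features move toward the data mean, while components that receive almost none stay at the background mean.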


Fisher Vector and Supervector

Interpretation with Supervector

[Figure: original distribution and new distribution]

Picture from Reynolds, Quatieri, and Dunn, DSP, 2001


Fisher Vector and Supervector

Normalization of Supervector

In practice, a normalization process using the covariance matrix often improves the performance.

Moreover, we can subtract the original mean vector for ease of normalization.

The representation is also called Hierarchical Gaussianization (HG).
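One plausible form of this normalization, sketched under assumptions: subtract the original mean of each component, divide by the component's standard deviation (diagonal covariance), and scale by the square root of the component weight. The exact weighting varies between papers, and the function name `normalize_supervector` and the numbers below are purely illustrative.

```python
import numpy as np

def normalize_supervector(adapted, means, variances, weights):
    """Whiten and weight the mean shifts, then stack them.

    adapted, means, variances: (K, d) arrays; weights: (K,) mixture weights.
    """
    shift = adapted - means                          # subtract the original means
    whitened = shift / np.sqrt(variances)            # normalize by the covariance
    return (np.sqrt(weights)[:, None] * whitened).ravel()

# Toy values for a 2-component, 2-D GMM (assumed).
weights = np.array([0.25, 0.75])
means = np.zeros((2, 2))
variances = np.array([[4.0, 4.0], [1.0, 1.0]])
adapted = np.array([[2.0, 2.0], [1.0, 1.0]])

phi = normalize_supervector(adapted, means, variances, weights)
```

With this scaling, a plain dot product between two normalized supervectors behaves like a covariance-aware comparison of the mean shifts.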


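Fisher Vector and Supervector

Fisher Vector

Let $p(X \mid \lambda)$ be the probability density function with parameters $\lambda$.

[Jaakkola and Haussler, NIPS 98] suggested that $X$ can be described by the derivative with respect to $\lambda$: $G_\lambda^X = \nabla_\lambda \log p(X \mid \lambda)$

Now we can define the Fisher kernel $K(X, Y) = (G_\lambda^X)^\top F_\lambda^{-1}\, G_\lambda^Y$, where $F_\lambda = E_X\!\left[ G_\lambda^X (G_\lambda^X)^\top \right]$ is called the Fisher information matrix.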


Fisher Vector and Supervector

Fisher Vector with GMM

Let

Consider the Gaussian Mixture Model

We consider

With GMM, Fisher vectors can be obtained:

The Fisher vector
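A minimal sketch of the mean-gradient part of a Fisher vector for a diagonal-covariance GMM. It assumes the commonly used closed form in which component k contributes (1 / (N √w_k)) Σ_n γ_n(k) (x_n − μ_k) / σ_k; gradients with respect to weights and variances, and any later power or L2 normalization, are left out. The function name and toy data are assumptions.

```python
import numpy as np

def fisher_vector_means(X, weights, means, variances):
    """Normalized gradient of the GMM log-likelihood w.r.t. the means."""
    sigma = np.sqrt(variances)
    diff = (X[:, None, :] - means[None, :, :]) / sigma[None, :, :]   # (N, K, d)
    # Posteriors gamma[n, k], computed in the log domain for stability.
    logp = (np.log(weights)[None, :]
            - 0.5 * np.log(2 * np.pi * variances).sum(axis=1)[None, :]
            - 0.5 * (diff ** 2).sum(axis=2))
    logp -= logp.max(axis=1, keepdims=True)
    gamma = np.exp(logp)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Pool the posterior-weighted, whitened residuals over all features.
    G = (gamma[:, :, None] * diff).sum(axis=0)                       # (K, d)
    G /= len(X) * np.sqrt(weights)[:, None]
    return G.ravel()

# Toy GMM and features (assumed values).
weights = np.array([0.4, 0.6])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
variances = np.array([[1.0, 1.0], [2.0, 2.0]])
rng = np.random.default_rng(0)
X = rng.normal(loc=1.0, scale=1.0, size=(100, 2))

fv = fisher_vector_means(X, weights, means, variances)   # length K * d = 4
```

The pooled quantity is a posterior-weighted sum of residuals (x − μ_k), the same statistic that drives the mean adaptation in the supervector, which is one way to see why the two representations end up so similar.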


Fisher Vector and Supervector

• Supervector

• Fisher vector

Comparison


Fisher Vector and Supervector

Comparison

Diagonal covariance matrix

Diagonal covariance with same derivation

Posterior estimation of

The two representations are almost the same, even with different motivations.


Fisher Vector and Supervector

How to Code Your Own

• Learn from existing code

http://lear.inrialpes.fr/src/inria_fisher/ (Linux or Mac)

• Learn from public GMM code

• Be careful with pitfalls

– Probabilities can be as small as the machine’s rounding error: compute log P instead of P

– Try different normalization strategies

– Try to make the code efficient
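The "compute log P instead of P" advice usually comes down to the log-sum-exp trick when normalizing GMM posteriors. A minimal sketch, with made-up log-probabilities chosen so that the naive computation underflows:

```python
import numpy as np

def logsumexp(a, axis=-1):
    """Stable log(sum(exp(a))) along an axis."""
    m = a.max(axis=axis, keepdims=True)          # factor out the largest term
    return np.squeeze(m, axis=axis) + np.log(np.exp(a - m).sum(axis=axis))

# log(w_k * N(x_n | mu_k, Sigma_k)) for 2 features and 2 components (toy values).
log_joint = np.array([[-1000.0, -1001.0],
                      [-2.0, -3.0]])

naive = np.exp(log_joint).sum(axis=1)            # first entry underflows to 0.0
log_px = logsumexp(log_joint, axis=1)            # finite for both features
log_gamma = log_joint - log_px[:, None]          # log posteriors per component
gamma = np.exp(log_gamma)                        # rows sum to 1
```

Working in the log domain keeps the posteriors of the first feature well defined even though its raw probabilities are far below floating-point range.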


Summary

• Histogram of SIFT: coding by vector quantization; pooling by histogram aggregation

• Uncertainty-Based Quantization: coding by soft quantization; pooling by histogram aggregation

• Sparse Coding: coding by soft quantization; pooling by max pooling

• Fisher vector / Supervector: coding by GMM probability estimation; pooling by GMM adaptation


Todo Before Next Class

• Read the deformable part model

• Enjoy Rogerio’s talk next week ☺

• Project proposal deadline Feb 19

• Project presentation Feb 21


Summary

• Simplest model: histogram of visual words

• Models with good illustration: pLSA, LDA, S-LTM, …

• Soft quantization: uncertainty-based quantization and sparse coding

• Very good performance: Fisher vector or supervector