
C03 Machine Vision Algorithms.ppt [Compatibility Mode] · dasan.sejong.ac.kr/~dihan/cv/C03_Machine Vision... · 2016-03-21 · Machine Vision Algorithms, Computer Engineering, Sejong University


1/104

Computer Vision

3. Machine Vision Algorithms

Computer Engineering, Sejong University

Dongil Han

2/104

SIFT(Scale Invariant Feature Transform)

Adaboost(Adaptive Boosting)

SVM(Support Vector Machine)

Precision and Recall

DL(Deep Learning)

Contents


3/104

Scale Invariant Feature Transform

Scale-invariant feature transform (or SIFT) is an algorithm in computer vision to detect and describe local features in images. The algorithm was published by David Lowe in 1999.

Applications include object recognition, robotic mapping and navigation, image stitching, 3D modeling, gesture recognition, video tracking, and match moving.

The algorithm is patented in the US; the owner is the University of British Columbia.

David Lowe, Computer Science Department, University of British Columbia

4/104

Challenges

• Scale change

• Rotation

• Occlusion

• Illumination

……


5/104

Overview of SIFT

6/104

Scale Space extrema detection

• Find points whose surrounding patches (at some scale) are distinctive

• Use the difference of Gaussians (DoG), an approximation to the scale-normalized Laplacian of Gaussian


7/104

Maxima and minima in a 3*3*3 neighborhood
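The 3*3*3 neighborhood test can be sketched in a few lines. This is a minimal illustration, assuming the DoG stack is stored as nested Python lists indexed as `dog[scale][row][col]`; the function name and the toy data are hypothetical and not taken from the slides.

```python
def is_scale_space_extremum(dog, s, y, x):
    """Return True if dog[s][y][x] is strictly larger (or strictly smaller)
    than all 26 neighbors in the 3x3x3 cube around it.

    The caller must pass an interior index (not on any border of the stack).
    """
    center = dog[s][y][x]
    neighbors = [
        dog[s + ds][y + dy][x + dx]
        for ds in (-1, 0, 1)
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
        if (ds, dy, dx) != (0, 0, 0)
    ]
    return center > max(neighbors) or center < min(neighbors)


# Toy stack: three 3x3 DoG layers, all zero except the center sample.
dog = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
dog[1][1][1] = 5.0
print(is_scale_space_extremum(dog, 1, 1, 1))  # True
```

Comparing against the adjacent scales as well as the 8 spatial neighbors is what makes the detected point an extremum in scale and not only in position.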

8/104

Locate DOG Extrema

• There are still many candidate points, and some of them are not good enough.

• The locations of the keypoints may not be accurate.


9/104

Edge Response Elimination

• A point on an edge has a large principal curvature across the edge but a small one in the perpendicular direction (along the edge)

• The principal curvatures can be calculated from the Hessian matrix H

• The eigenvalues of H are proportional to the principal curvatures, so the two eigenvalues should not differ too much
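The eigenvalue comparison can be done without computing the eigenvalues, using only the trace and determinant of H. The sketch below follows the ratio test from Lowe's paper; the threshold `r = 10` is the value suggested there and is not stated on the slides, and the function name is hypothetical.

```python
def passes_edge_test(dxx, dyy, dxy, r=10.0):
    """Keep a keypoint only if the ratio of principal curvatures is below r.

    dxx, dyy, dxy are second derivatives of the DoG image at the keypoint,
    i.e. the entries of the 2x2 Hessian H = [[dxx, dxy], [dxy, dyy]].
    """
    trace = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:
        # Curvatures have opposite signs (or one is zero): reject.
        return False
    # trace^2 / det depends only on the ratio of the two eigenvalues.
    return trace * trace / det < (r + 1.0) ** 2 / r


print(passes_edge_test(1.0, 1.0, 0.0))   # blob-like point: True
print(passes_edge_test(20.0, 1.0, 0.0))  # edge-like point: False
```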

10/104


11/104

Orientation assignment

• Assign an orientation to each keypoint; the keypoint descriptor can then be represented relative to this orientation and therefore achieves invariance to image rotation

• Compute magnitude and orientation on the Gaussian smoothed images

12/104

Orientation assignment

• A histogram is formed by quantizing the orientations into 36 bins

• Peaks in the histogram correspond to the dominant orientations of the patch

• For the same scale and location, there can be multiple keypoints with different orientations
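A minimal sketch of the 36-bin histogram and its peaks follows. It assumes gradient magnitudes and orientations (in degrees) have already been computed around the keypoint. The 80% peak ratio is the value from Lowe's paper, not from the slides, and the Gaussian weighting and peak interpolation used there are left out for brevity. All names are hypothetical.

```python
def orientation_histogram(magnitudes, orientations_deg, num_bins=36):
    """Accumulate gradient magnitudes into orientation bins of 360/num_bins degrees."""
    hist = [0.0] * num_bins
    bin_width = 360.0 / num_bins
    for m, theta in zip(magnitudes, orientations_deg):
        hist[int((theta % 360.0) / bin_width) % num_bins] += m
    return hist


def dominant_orientations(hist, peak_ratio=0.8):
    """Return the bin centers (degrees) of local peaks within peak_ratio of the maximum."""
    n = len(hist)
    top = max(hist)
    bin_width = 360.0 / n
    peaks = []
    for i, v in enumerate(hist):
        left, right = hist[(i - 1) % n], hist[(i + 1) % n]  # circular neighbors
        if v > 0 and v >= peak_ratio * top and v > left and v > right:
            peaks.append((i + 0.5) * bin_width)
    return peaks


# Toy gradients: a strong direction near 5 degrees and a second one near 95 degrees.
magnitudes = [1.0, 1.0, 1.0, 1.3, 1.3, 0.5]
orientations = [5.0, 7.0, 8.0, 95.0, 96.0, 200.0]
hist = orientation_histogram(magnitudes, orientations)
peaks = dominant_orientations(hist)
print(peaks)  # [5.0, 95.0]
```

Each returned peak would produce its own keypoint at the same location and scale, which is how one location can yield several keypoints.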


13/104

Feature Descriptor

• Based on 16*16 patches

• 4*4 subregions

• 8 bins in each subregion

• 4*4*8=128 dimensions in total
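The layout of the 128 dimensions can be sketched as below. This is a simplified illustration, assuming a 16*16 patch of gradient magnitudes and orientations that are already expressed relative to the keypoint orientation. The Gaussian weighting, interpolation between bins, and clamping used in Lowe's paper are omitted, and the function name is hypothetical.

```python
import math


def sift_like_descriptor(magnitudes, orientations_deg):
    """Build a 4x4x8 = 128-dimensional histogram from a 16x16 patch of gradients.

    magnitudes[r][c] and orientations_deg[r][c] are 16x16 nested lists.
    """
    desc = [0.0] * 128
    for r in range(16):
        for c in range(16):
            cell = (r // 4) * 4 + (c // 4)  # index of the 4x4 subregion
            obin = int((orientations_deg[r][c] % 360.0) / 45.0) % 8  # 8 bins
            desc[cell * 8 + obin] += magnitudes[r][c]
    # Normalize to unit length to reduce the effect of illumination changes.
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc


# Toy patch: every gradient has magnitude 1 and points at 0 degrees.
mags = [[1.0] * 16 for _ in range(16)]
oris = [[0.0] * 16 for _ in range(16)]
descriptor = sift_like_descriptor(mags, oris)
print(len(descriptor))  # 128
```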

14/104

Application : Object recognition

• The SIFT features of training images are extracted and stored

• For a query image

1. Extract SIFT features

2. Match them using efficient nearest-neighbor indexing

3. Geometric verification using 3 keypoints


15/104

16/104

SURF(Speeded Up Robust Feature)

• Approximate version of SIFT

• Works almost equally well

• Very fast


17/104

SIFT Summary

• The most successful feature (probably the most successful paper in computer vision)

• It relies on many heuristics, and its parameters were optimized on a small, specific dataset. Different tasks may need different parameter settings.

• Learning local image descriptors (Winder et al 2007): tuning parameters given their dataset.

• We need a universal objective function.

18/104

Adaptive Boosting(Adaboost)

• Introduced by Schapire and Freund in the 1990s.

• “Boosting”: convert a weak learning algorithm into a strong one.

• Main idea: Combine many weak classifiers to produce a powerful committee.

• Algorithms:

– AdaBoost: adaptive boosting

– Gentle AdaBoost

– BrownBoost

– …


19/104

Classifiers

• Classifier Examples

– Linear Classifier

– Quadratic Classifier

– Nonlinear Classifier

[Figure: test set example and classifier examples]

20/104

Adaboost Learning Algorithm


21/104

Adaboost Learning Algorithm

• Round 1 of 3

[Figure: weak classifier h1 and the reweighted distribution D2]

ε1 = 0.300

α1 = 0.424

22/104

Adaboost Learning Algorithm

• Round 2 of 3

[Figure: distribution D2 and weak classifier h2]

ε2 = 0.196

α2 = 0.704

23/104

Adaboost Learning Algorithm

• Round 3 of 3

[Figure: weak classifier h3]

ε3 = 0.344

α3 = 0.323

STOP
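The classifier weights in the three rounds are consistent with the standard AdaBoost formula α = ½ ln((1 − ε) / ε). The formula is not legible in the extracted slides, so treat it as an assumption checked against the numbers shown. A quick check:

```python
import math


def classifier_weight(error):
    """AdaBoost weight of a weak classifier with weighted error 0 < error < 0.5."""
    return 0.5 * math.log((1.0 - error) / error)


for round_index, eps in enumerate([0.300, 0.196, 0.344], start=1):
    print(round_index, eps, round(classifier_weight(eps), 3))
```

Rounds 1 and 3 reproduce the slide values 0.424 and 0.323. Round 2 gives about 0.706 where the slide shows 0.704, presumably because the error on the slide is rounded.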

24/104

Adaboost Learning Algorithm

• Final Hypothesis

25/104

Adaboost Learning Algorithm

• Learning Process

26/104

Adaboost Learning Algorithm

• Final Classifier and parameters


27/104

Algorithm Recapitulation
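As a recap, the whole procedure can be sketched as below. This is a minimal version of discrete AdaBoost, assuming 1-D inputs, labels in {-1, +1}, and threshold "stumps" as weak classifiers. The slides' own pseudocode is not recoverable from the extraction, so the data, names, and choice of weak learner here are illustrative assumptions.

```python
import math


def train_adaboost(xs, ys, rounds=3):
    """Return a list of (alpha, threshold, polarity) weak classifiers."""
    n = len(xs)
    weights = [1.0 / n] * n  # D_1: uniform distribution over examples
    points = sorted(set(xs))
    thresholds = [points[0] - 1.0] + [(a + b) / 2.0 for a, b in zip(points, points[1:])]
    ensemble = []
    for _ in range(rounds):
        # Choose the stump with the lowest weighted error under the current weights.
        best = None
        for thr in thresholds:
            for pol in (+1, -1):
                preds = [pol if x > thr else -pol for x in xs]
                err = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, thr, pol, preds)
        err, thr, pol, preds = best
        err = min(max(err, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, thr, pol))
        # Raise the weight of misclassified examples, lower the rest, renormalize.
        weights = [w * math.exp(-alpha * y * p) for w, y, p in zip(weights, ys, preds)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Final hypothesis: sign of the alpha-weighted vote of the weak classifiers."""
    score = sum(alpha * (pol if x > thr else -pol) for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1


# Toy data that no single threshold can classify correctly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
ys = [+1, +1, +1, -1, -1, -1, +1, +1, +1, -1]
ensemble = train_adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs] == ys)  # True
```

Each stump alone misclassifies at least three points of this toy set, yet the weighted vote of three stumps classifies all ten correctly, which is the "committee" idea from the introduction.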

28/104


29/104

30/104

Adaboost in Face Detection


31/104

Adaboost in Face Detection

• Adaboost Learning Algorithm by Froba and Ernst

– Detector has four classifiers of increasing complexity

• Analyzes image patches W of size 22x22 after the MCT (Modified Census Transform): Γ

• Location within the analysis window W: x

• Adaboost classifier stage: j

• Classifier: H; pixel classifier: h_x

32/104

Adaboost Summary

• Advantages

- Very simple to implement

- Does feature selection, resulting in a relatively simple classifier

- Fairly good generalization

• Disadvantages

- Suboptimal solution

- Sensitive to noisy data and outliers


33/104

Support Vector Machine(SVM)

• A classifier derived from statistical learning theory by Vapnik, et al. in 1992

• SVM became famous when, using images as input, it gave accuracy comparable to neural networks with hand-designed features in a handwriting recognition task

• Currently, SVM is widely used in object detection & recognition, content-based image retrieval, text recognition, biometrics, speech recognition, etc.

• Also used for regression

• Demo of SVM: http://www.csie.ntu.edu.tw/~cjlin/libsvm/

[Photo: V. Vapnik]

These slides are courtesy of www.iro.umontreal.ca/~pift6080/documents/papers/svm_tutorial.ppt

34/104

Discriminant Function

• It can be an arbitrary function of x, such as:

– Nearest Neighbor

– Decision Tree

– Linear Functions: $g(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b$

– Nonlinear Functions


35/104

Linear Discriminant Function

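• g(x) is a linear function: $g(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b$

• A hyper-plane in the feature space separates the region $\mathbf{w}^T \mathbf{x} + b > 0$ from the region $\mathbf{w}^T \mathbf{x} + b < 0$

• (Unit-length) normal vector of the hyper-plane: $\mathbf{n} = \dfrac{\mathbf{w}}{\|\mathbf{w}\|}$

[Figure: the hyper-plane in the (x1, x2) plane with its normal vector n]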

36/104

Linear Discriminant Function

• How would you classify these points using a linear discriminant function in order to minimize the error rate?

=> Infinite number of answers!

[Figure: points in the (x1, x2) plane; two markers denote the classes +1 and -1]


37/104

Linear Discriminant Function

• How would you classify these points using a linear discriminant function in order to minimize the error rate?

=> Infinite number of answers!

=> Which one is the best?

[Figure: points in the (x1, x2) plane; two markers denote the classes +1 and -1]

38/104

Large Margin Linear Classifier

• The linear discriminant function (classifier) with the maximum margin is the best

• Margin is defined as the width that the boundary could be increased by before hitting a data point

• Why is it the best?

- Robust to outliers and thus strong generalization ability

[Figure: points in the (x1, x2) plane with classes +1 and -1; the margin is marked as a “safe zone” around the boundary]


39/104

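Large Margin Linear Classifier

• Given a set of data points $\{(\mathbf{x}_i, y_i)\},\ i = 1, 2, \ldots, n$, where

For $y_i = +1$: $\mathbf{w}^T \mathbf{x}_i + b > 0$

For $y_i = -1$: $\mathbf{w}^T \mathbf{x}_i + b < 0$

• With a scale transformation on both w and b, the above is equivalent to

For $y_i = +1$: $\mathbf{w}^T \mathbf{x}_i + b \ge 1$

For $y_i = -1$: $\mathbf{w}^T \mathbf{x}_i + b \le -1$

[Figure: points in the (x1, x2) plane; two markers denote the classes +1 and -1]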

40/104

Large Margin Linear Classifier

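• We know that

$\mathbf{w}^T \mathbf{x}^+ + b = 1$

$\mathbf{w}^T \mathbf{x}^- + b = -1$

• The margin width is:

$M = (\mathbf{x}^+ - \mathbf{x}^-) \cdot \mathbf{n} = (\mathbf{x}^+ - \mathbf{x}^-) \cdot \dfrac{\mathbf{w}}{\|\mathbf{w}\|} = \dfrac{2}{\|\mathbf{w}\|}$

[Figure: the margin between the support vectors x+ and x- in the (x1, x2) plane, with normal vector n]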


41/104

Large Margin Linear Classifier

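• Formulation:

maximize $\dfrac{2}{\|\mathbf{w}\|}$

such that

For $y_i = +1$: $\mathbf{w}^T \mathbf{x}_i + b \ge 1$

For $y_i = -1$: $\mathbf{w}^T \mathbf{x}_i + b \le -1$

[Figure: the margin between the support vectors x+ and x- in the (x1, x2) plane, with normal vector n]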

42/104

Large Margin Linear Classifier

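• Formulation:

maximize $\dfrac{2}{\|\mathbf{w}\|}$

such that

For $y_i = +1$: $\mathbf{w}^T \mathbf{x}_i + b \ge 1$

For $y_i = -1$: $\mathbf{w}^T \mathbf{x}_i + b \le -1$

• Formulation:

minimize $\dfrac{1}{2}\|\mathbf{w}\|^2$

such that

For $y_i = +1$: $\mathbf{w}^T \mathbf{x}_i + b \ge 1$

For $y_i = -1$: $\mathbf{w}^T \mathbf{x}_i + b \le -1$

or

$y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1$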


43/104

Solving the Optimization Problem

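Quadratic programming with linear constraints:

minimize $\dfrac{1}{2}\|\mathbf{w}\|^2$

s.t. $y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1$

Lagrangian Function:

minimize $L_p(\mathbf{w}, b, \alpha_i) = \dfrac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{n} \alpha_i \left( y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 \right)$

s.t. $\alpha_i \ge 0$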

44/104

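Solving the Optimization Problem

minimize $L_p(\mathbf{w}, b, \alpha_i) = \dfrac{1}{2}\|\mathbf{w}\|^2 - \sum_{i=1}^{n} \alpha_i \left( y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 \right)$

s.t. $\alpha_i \ge 0$

$\dfrac{\partial L_p}{\partial \mathbf{w}} = 0 \;\Rightarrow\; \mathbf{w} = \sum_{i=1}^{n} \alpha_i y_i \mathbf{x}_i$

$\dfrac{\partial L_p}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{n} \alpha_i y_i = 0$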


45/104

Solving the Optimization Problem

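• From KKT condition:

$\alpha_i \left( y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 \right) = 0$

• Thus, only support vectors have $\alpha_i \ne 0$

• The solution has the form:

$\mathbf{w} = \sum_{i=1}^{n} \alpha_i y_i \mathbf{x}_i = \sum_{i \in SV} \alpha_i y_i \mathbf{x}_i$

get b from $y_i(\mathbf{w}^T \mathbf{x}_i + b) - 1 = 0$, where $\mathbf{x}_i$ is a support vector

[Figure: the support vectors x+ and x- in the (x1, x2) plane]

KKT condition: URL: https://en.wikipedia.org/wiki/Karush%E2%80%93Kuhn%E2%80%93Tucker_conditions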

46/104

Solving the Optimization Problem

• The linear discriminant function is:

$g(\mathbf{x}) = \mathbf{w}^T \mathbf{x} + b = \sum_{i \in SV} \alpha_i y_i \mathbf{x}_i^T \mathbf{x} + b$

• Notice it relies on a dot product between the test point x and the support vectors $\mathbf{x}_i$

• Also keep in mind that solving the optimization problem involved computing the dot products $\mathbf{x}_i^T \mathbf{x}_j$ between all pairs of training points
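The pieces of the solution fit together as in the sketch below: w from the support vectors, b from one support vector, g(x) from dot products only, and the margin width 2/||w||. The two-point toy problem and its dual values α = 0.25 are hypothetical numbers chosen so the answer can be checked by hand; they do not come from the slides.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


# Hypothetical toy problem: one support vector per class.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
labels = [+1, -1]
alphas = [0.25, 0.25]  # dual solution for this toy problem; sum(alpha_i * y_i) = 0

# w = sum_i alpha_i * y_i * x_i
w = [
    sum(a * y * sv[d] for a, y, sv in zip(alphas, labels, support_vectors))
    for d in range(2)
]

# From y_i (w^T x_i + b) - 1 = 0 and y_i in {-1, +1}: b = y_i - w^T x_i
b = labels[0] - dot(w, support_vectors[0])


def g(x):
    """Discriminant computed only from dot products with the support vectors."""
    return sum(a * y * dot(sv, x) for a, y, sv in zip(alphas, labels, support_vectors)) + b


margin = 2.0 / math.sqrt(dot(w, w))
print(w, b, g((1.0, 1.0)), g((-1.0, -1.0)), margin)
```

Here w = (0.5, 0.5) and b = 0, so g evaluates to +1 and -1 on the two support vectors, and the margin equals the distance between them, 2√2.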


47/104

Large Margin Linear Classifier

• What if the data is not linearly separable? (noisy data, outliers, etc.)

• Slack variables ξi can be added to allow mis-classification of difficult or noisy data points

[Figure: points in the (x1, x2) plane with classes +1 and -1; slack variables ξ1 and ξ2 are marked for two points]

48/104

Large Margin Linear Classifier

• Formulation:

minimize $\dfrac{1}{2}\|\mathbf{w}\|^2 + C \sum_{i=1}^{n} \xi_i$

such that

$y_i(\mathbf{w}^T \mathbf{x}_i + b) \ge 1 - \xi_i$

$\xi_i \ge 0$

• Parameter C can be viewed as a way to control over-fitting.
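For a fixed w and b, the smallest slack that satisfies both constraints is ξi = max(0, 1 − yi(wᵀxi + b)), so the objective can be evaluated directly. The sketch below does that for a hypothetical three-point dataset in which the third point is a noisy example on the wrong side. Names and data are illustrative only.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def soft_margin_objective(w, b, xs, ys, C):
    """Return (objective, slacks) for the soft-margin formulation at a given (w, b)."""
    # Smallest xi_i >= 0 with y_i (w^T x_i + b) >= 1 - xi_i
    slacks = [max(0.0, 1.0 - y * (dot(w, x) + b)) for x, y in zip(xs, ys)]
    return 0.5 * dot(w, w) + C * sum(slacks), slacks


xs = [(1.0, 1.0), (-1.0, -1.0), (0.5, 0.5)]
ys = [+1, -1, -1]  # the third point is labeled -1 but lies on the +1 side
objective, slacks = soft_margin_objective((0.5, 0.5), 0.0, xs, ys, C=1.0)
print(objective, slacks)  # 1.75 [0.0, 0.0, 1.5]
```

A larger C makes each unit of slack more expensive, pushing the optimizer to fit noisy points; a smaller C tolerates them in exchange for a wider margin.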


49/104

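Non-linear SVMs

• Datasets that are linearly separable with noise work out great:

[Figure: points on the x axis]

• But what are we going to do if the dataset is just too hard?

[Figure: points on the x axis]

• How about… mapping data to a higher-dimensional space:

[Figure: the same points plotted with x² on the vertical axis against x]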

50/104

Non-linear SVMs: Feature Space

• General idea: the original input space can be mapped to some higher-dimensional feature space where the training set is separable:

Φ: x→ φ(x)
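A small sketch of this idea, assuming the mapping φ(x) = (x, x²) suggested by the x-versus-x² figure: the hypothetical 1-D dataset below cannot be split by any single threshold, but becomes linearly separable after the mapping. The data, the weights, and the function names are illustrative assumptions.

```python
def phi(x):
    """Map a 1-D input to the 2-D feature space (x, x^2)."""
    return (x, x * x)


def separable_by_threshold(xs, ys):
    """True if some 1-D threshold (with either polarity) classifies every point."""
    points = sorted(xs)
    thresholds = [points[0] - 1.0] + [(a + b) / 2.0 for a, b in zip(points, points[1:])]
    for thr in thresholds:
        for pol in (+1, -1):
            if all((pol if x > thr else -pol) == y for x, y in zip(xs, ys)):
                return True
    return False


xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [+1, +1, -1, -1, -1, +1, +1]  # positives on both sides of the negatives

# A hyper-plane in feature space: w^T phi(x) + b with w = (0, 1), b = -2.5
w, b = (0.0, 1.0), -2.5
separable_in_feature_space = all(
    (1 if w[0] * phi(x)[0] + w[1] * phi(x)[1] + b > 0 else -1) == y
    for x, y in zip(xs, ys)
)
print(separable_by_threshold(xs, ys), separable_in_feature_space)  # False True
```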


51/104

Applications of SVMs

• Bioinformatics

• Machine Vision

• Text Categorization

• Ranking (e.g., Google searches)

• Handwritten Character Recognition

• Time series analysis

Lots of very successful applications!!!

52/104

Car license plate extraction

• Extract SIFT Feature first

• Classification using SVM

Classification results against ground truth:

                        Classified Outside    Classified Inside
Ground truth Outside    293734 (TN)           3398 (FP)
Ground truth Inside     5505 (FN)             5545 (TP)

Precision = 62%, Recall = 50.2%

False negative rate = 49.8%

False positive rate = 1.1%


53/104

Summary: Support Vector Machine

• Large Margin Classifier – Better generalization ability & less over-fitting

• The Kernel Trick

– Map data points to higher dimensional space in order to make them linearly separable.

– Since only dot product is used, we do not need to represent the mapping explicitly.
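To illustrate why the mapping never has to be written out, the sketch below compares an explicit quadratic feature map with the degree-2 polynomial kernel K(x, z) = (xᵀz)². The specific kernel and map are a standard textbook pair chosen for illustration; the slides do not name a particular kernel.

```python
import math


def phi_quadratic(x):
    """Explicit feature map for 2-D inputs: (x1^2, sqrt(2) x1 x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


x, z = (1.0, 2.0), (3.0, 4.0)
explicit = sum(a * b for a, b in zip(phi_quadratic(x), phi_quadratic(z)))
implicit = poly_kernel(x, z)
print(explicit, implicit)  # both are 121 up to floating-point rounding
```

Replacing every dot product in the SVM solution with such a kernel gives a non-linear classifier at the cost of a dot product in the original space.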

54/104

Precision and Recall

• Precision: fraction of retrieved docs that are relevant = P(relevant|retrieved)

• Recall: fraction of relevant docs that are retrieved = P(retrieved|relevant)

Precision P = tp/(tp + fp)

Recall R = tp/(tp + fn)
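Applying these formulas to the confusion matrix from the car license plate slide reproduces the rates quoted there. The counts are from that slide; the variable names are just for this sketch.

```python
# Counts from the license plate extraction slide.
tp, fp, fn, tn = 5545, 3398, 5505, 293734

precision = tp / (tp + fp)
recall = tp / (tp + fn)
false_negative_rate = fn / (fn + tp)
false_positive_rate = fp / (fp + tn)

print(f"precision = {precision:.1%}")
print(f"recall = {recall:.1%}")
print(f"false negative rate = {false_negative_rate:.1%}")
print(f"false positive rate = {false_positive_rate:.1%}")
```

The false positive rate is low mainly because the negative class (background) is much larger than the positive class, which is why precision and recall are the more informative pair for this task.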


55/104

Precision and Recall

• Car license plate extraction example

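- TP (True Positive): a license plate classified as a license plate

- FP (False Positive): background classified as a license plate

- FN (False Negative): a license plate classified as background

- TN (True Negative): background classified as background

Precision: of the regions classified as license plates, the fraction that are actually license plates

Recall (detection rate): of all license plates, the fraction that are actually classified as license plates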

56/104

DL(Deep Learning)

• What exactly is deep learning?

- ‘Deep Learning’ means using a neural network with several layers of nodes between input and output

• Why is it generally better than other methods on image, speech and certain other types of data?

- The series of layers between input and output do feature identification and processing in a series of stages, just as our brains seem to.

These slides are courtesy of https://www.macs.hw.ac.uk/~dwcorne/


57/104

DL(Deep Learning)

• Multilayer neural networks have been around for 25 years. What’s actually new?

– We have always had good algorithms for learning the weights in networks with 1 hidden layer

– But these algorithms are not good at learning the weights for networks with more hidden layers

– What’s new is: algorithms for training many-layer networks

58/104

Brain vs. Computer

Brain

1. 10 billion neurons

2. 60 trillion synapses

3. Distributed processing

4. Nonlinear processing

5. Parallel processing

Computer

1. Faster than neuron (10^-9 sec), cf. neuron: 10^-3 sec

3. Central processing

4. Arithmetic operation (linearity)

5. Sequential processing


59/104

Neuron vs Artificial Neuron

[Figure labels: dendrite, cell body, axon]

60/104

Simple Perceptron


61/104

Classic Perceptron

Nonlinear Neuron: Sigmoid Unit

• The sigmoid function $\sigma(x) = \dfrac{1}{1 + e^{-x}}$ is differentiable:

$\dfrac{\partial \sigma(x)}{\partial x} = \sigma(x)\,(1 - \sigma(x))$
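The derivative identity is easy to check numerically. The sketch below compares σ(x)(1 − σ(x)) with a central finite difference; the step size and test point are arbitrary choices for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Closed form: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def finite_difference(f, x, h=1e-5):
    """Central difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x = 0.7
print(sigmoid_derivative(x), finite_difference(sigmoid, x))
```

Having the derivative in closed form, in terms of the unit's own output, is what makes the sigmoid convenient for gradient-based weight learning.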

62/104

Multilayer Perceptron


63/104

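[Figure: a unit f(x) with inputs 1.4, -2.5, -0.06 and weights W1, W2, W3]

64/104

[Figure: the same unit f(x) with inputs 1.4, -2.5, -0.06 and weights 2.7, -8.6, 0.002]

x = (-0.06)×2.7 + (-2.5)×(-8.6) + 1.4×0.002 = 21.34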
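The weighted sum on this slide can be reproduced directly. The pairing of inputs and weights follows the equation on the slide; passing the sum through a sigmoid for f(x) is an assumption, since the slide only labels the unit f(x).

```python
import math

inputs = [-0.06, -2.5, 1.4]
weights = [2.7, -8.6, 0.002]

# Weighted sum of the inputs: the value called x on the slide.
x = sum(i * w for i, w in zip(inputs, weights))
print(round(x, 2))  # 21.34

# Assumed activation: sigmoid. For such a large x the unit is saturated near 1.
output = 1.0 / (1.0 + math.exp(-x))
print(output)
```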


65/104

Training the neural network

Training data:

Fields           class
1.4  2.7  1.9    0
3.8  3.4  3.2    0
6.4  2.8  1.7    1
4.1  0.1  0.2    0
etc …

66/104

Feed it through to get output

[Figure: inputs 1.4, 2.7, 1.9 give output 0.8]

67/104

Compare with target output

[Figure: inputs 1.4, 2.7, 1.9; output 0.8; target 0; error 0.8]

68/104

Adjust weights based on error

[Figure: inputs 1.4, 2.7, 1.9; output 0.8; target 0; error 0.8]

69/104

Feed it through to get output

[Figure: inputs 6.4, 2.8, 1.7 give output 0.9]

70/104

Compare with target output

[Figure: inputs 6.4, 2.8, 1.7; output 0.9; target 1; error -0.1]

71/104

Adjust weights based on error

[Figure: inputs 6.4, 2.8, 1.7; output 0.9; target 1; error -0.1]

72/104

And so on ….

Repeat this thousands, maybe millions of times, each time taking a random training instance and making slight weight adjustments.

Algorithms for weight adjustment are designed to make changes that will reduce the error.
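The feed-forward, compare, adjust loop can be sketched for a single sigmoid unit as below. The slides do not specify the network or the update rule, so everything beyond the four data rows is an assumption: one unit with a bias, the update `w -= lr * error * x` (the gradient step for a sigmoid unit under cross-entropy loss), a learning rate of 0.05, and 5000 steps.

```python
import math
import random


def predict(w, b, x):
    """Output of a single sigmoid unit (numerically stable form)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_step(w, b, x, target, lr=0.05):
    """One slight weight adjustment for one training instance."""
    error = predict(w, b, x) - target  # output minus target, as on the slides
    new_w = [wi - lr * error * xi for wi, xi in zip(w, x)]
    return new_w, b - lr * error


# The four rows shown on the slides: (fields, class).
DATA = [
    ([1.4, 2.7, 1.9], 0),
    ([3.8, 3.4, 3.2], 0),
    ([6.4, 2.8, 1.7], 1),
    ([4.1, 0.1, 0.2], 0),
]

random.seed(0)  # fixed seed, for repeatability only
w, b = [0.0, 0.0, 0.0], 0.0
for _ in range(5000):  # "thousands of times"
    x, target = random.choice(DATA)  # take a random training instance
    w, b = train_step(w, b, x, target)

for x, target in DATA:
    print(x, target, round(predict(w, b, x), 3))
```

Each call to `train_step` moves the output for the presented instance toward its target, but it can move the outputs for the other instances in either direction, which is the behavior the following slides describe.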


73/104

The decision boundary perspective…

Initial random weights

74/104

The decision boundary perspective…

Present a training instance / adjust the weights


75/104

The decision boundary perspective…

Eventually ….

76/104

The point I am trying to make

• weight-learning algorithms for NNs are dumb

• they work by making thousands and thousands of tiny adjustments, each making the network do better at the most recent pattern, but perhaps a little worse on many others

• but, by dumb luck, eventually this tends to be good enough to learn effective classifiers for many real applications


77/104

Some other points

If f(x) is non-linear, a network with 1 hidden layer can, in theory, learn perfectly any classification problem. A set of weights exists that can produce the targets from the inputs. The problem is finding them.

78/104

Some other ‘by the way’ points

If f(x) is linear, the NN can only draw straight decision boundaries (even if there are many layers of units)
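The reason extra layers do not help when f(x) is linear is that a stack of linear layers collapses into a single linear map. The sketch below checks this for two hypothetical weight matrices with f(x) = x and no biases (with biases the result is still a single affine map).

```python
def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    """Matrix-matrix product for nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


W1 = [[1, 2], [0, -1], [3, 1]]  # first layer: 2 inputs -> 3 hidden units
W2 = [[2, 0, 1], [-1, 1, 0]]    # second layer: 3 hidden units -> 2 outputs
x = [4, -3]

two_layers = matvec(W2, matvec(W1, x))  # apply the layers one after the other
one_layer = matvec(matmul(W2, W1), x)   # apply the single combined matrix
print(two_layers, one_layer)  # [5, 5] [5, 5]
```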


79/104

Some other ‘by the way’ points

NNs use nonlinear f(x) so they can draw complex boundaries, but keep the data unchanged

80/104

Some other ‘by the way’ points

NNs use nonlinear f(x) so they can draw complex boundaries, but keep the data unchanged

SVMs only draw straight lines, but they transform the data first in a way that makes that OK


81/104

Feature detectors

82/104

what is this unit doing?


83/104

Hidden layer units become self-organised feature detectors

[Figure: a hidden unit connected to input pixels numbered 1 … 63]

84/104

What does this unit detect?

[Figure: a hidden unit connected to input pixels numbered 1 … 63]

85/104

What does this unit detect?

It will send a strong signal for a horizontal line in the top row, ignoring everywhere else

86/104

What does this unit detect?

[Figure: a hidden unit connected to input pixels numbered 1 … 63]

87/104

What does this unit detect?

Strong signal for a dark area in the top left corner

88/104

What features might you expect a good NN to learn, when trained with data like this?


89/104

Vertical lines

90/104

Horizontal lines

91/104

Small circles

92/104

Small circles

But what about position invariance? Our example unit detectors were tied to specific parts of the image


93/104

successive layers can learn higher-level features …

[Figure: a layered network; the first layer detects lines in specific positions, etc.; the next layer holds higher-level detectors (horizontal line, “RHS vertical line”, “upper loop”, etc.)]

94/104

successive layers can learn higher-level features …

[Figure: the same layered network, with one higher-level unit highlighted]

What does this unit detect?


95/104

So: multiple layers make sense

Your brain works that way

96/104

So: multiple layers make sense

Many-layer neural network architectures should be capable of learning the true underlying features and ‘feature logic’, and therefore generalise very well …


97/104

But, until very recently, our weight-learning algorithms simply did not work on multi-layer architectures

98/104

The new way to train multi-layer NNs…


99/104

The new way to train multi-layer NNs…

Train this layer first

100/104

The new way to train multi-layer NNs…

Train this layer first

then this layer

then this layer

then this layer

finally this layer


101/104

Final layer trained to predict class based on outputs from previous layers

102/104

And that’s that

• That’s the basic idea

• There are many, many types of deep learning:

• different kinds of autoencoder, variations on architectures and training algorithms, etc.

• Very fast growing area …


103/104

DL Application

104/104

DL Summary

• Much recent excitement, still much to be discovered

• "Google-Brain"

• Sum of Products Nets

• Biological Plausibility

• Potential for significant improvements

• Good in structured spaces

– Important research question: To what extent can we use Deep Learning in more arbitrary feature spaces?

– Recent deep training of MLPs with BP shows potential in this area