Deep Learning Implementation Techniques “using Caffe”
2016-11-11
DGIST Future Automotive Convergence Research Center (미래자동차융합연구센터)
Heechul Jung
2016 IEMEK (대한임베디드공학회) Fall Conference Tutorial


Learning Deep Learning (research-oriented)

Install Linux (Ubuntu, Fedora, ...).

Read papers (arXiv, CVPR, ICCV, NIPS, ICML, ICLR).

Install a deep learning tool (Caffe, Torch, TensorFlow, ...).

Learn the deep learning tool.

Reproduce state-of-the-art algorithms (baseline).

1 day

several weeks

1 day

1 day

Implementing Idea.

few weeks

Writing a paper?

Real-time Object Recognition

CPU: Intel i5-4690 CPU 3.5GHz
RAM: 18GB
GPU: NVIDIA GeForce GTX 770


How to use Caffe.

What is Caffe?
Open framework, models, and worked examples for deep learning

- < 2 years old
- 600+ citations, 100+ contributors, 6,000+ stars
- 3,400+ forks
- focus has been vision, but branching out: sequences, reinforcement learning, speech + text

- Pure C++ / CUDA architecture for deep learning
- Command line, Python, MATLAB interfaces
- Fast, well-tested code
- Tools, reference models, demos, and recipes
- Seamless switch between CPU and GPU

Prototype, Train, Deploy


Caffe is a Community project pulse

Open Model Collection

The Caffe Model Zoo: open collection of deep models to share innovation
- VGG ILSVRC14 + Devil models in the zoo
- Network-in-Network / CCCP model in the zoo
- MIT Places scene recognition model in the zoo

help disseminate and reproduce research

bundled tools for loading and publishing models

Share Your Models! with your citation + license of course

Recipe for Brewing

1. Buy NVIDIA graphics cards.
2. Install Caffe.
3. Convert the data to Caffe format: lmdb, leveldb, hdf5 / .mat, list of images, etc.
4. Define the Net (prototxt).
5. Configure the Solver (prototxt).
6. caffe train -solver solver.prototxt -gpu 0

Examples are your friends:
caffe/examples/mnist,cifar10,imagenet
caffe/examples/*.ipynb
caffe/models/*

Choose your graphics card.

NVIDIA K40
Performance is best with ECC off and boost clock enabled. While ECC makes a negligible difference in speed, disabling it frees 1 GB of GPU memory.
Best settings with ECC off and maximum clock speed in standard Caffe:
• Training is 26.5 secs / 20 iterations (5,120 images)
• Testing is 100 secs / validation set (50,000 images)
Best settings with Caffe + cuDNN acceleration:
• Training is 19.2 secs / 20 iterations (5,120 images)
• Testing is 60.7 secs / validation set (50,000 images)

NVIDIA Titan
• Training: 26.26 secs / 20 iterations (5,120 images)
• Testing: 100 secs / validation set (50,000 images)
• cuDNN Training: 20.25 secs / 20 iterations (5,120 images)
• cuDNN Testing: 66.3 secs / validation set (50,000 images)

NVIDIA K20
• Training: 36.0 secs / 20 iterations (5,120 images)
• Testing: 133 secs / validation set (50,000 images)

NVIDIA GTX 770
• Training: 33.0 secs / 20 iterations (5,120 images)
• Testing: 129 secs / validation set (50,000 images)
• cuDNN Training: 24.3 secs / 20 iterations (5,120 images)
• cuDNN Testing: 104 secs / validation set (50,000 images)
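The timings are easier to compare as training throughput. The sketch below only divides the 5,120 images by the quoted seconds; the dictionary keys and the helper name are made up for this illustration.

```python
# Seconds per 20 training iterations (5,120 images), copied from the timings above.
TRAIN_SECONDS = {
    "K40": 26.5,
    "K40 + cuDNN": 19.2,
    "Titan": 26.26,
    "Titan + cuDNN": 20.25,
    "K20": 36.0,
    "GTX 770": 33.0,
    "GTX 770 + cuDNN": 24.3,
}
IMAGES_PER_RUN = 5120  # 20 iterations, i.e. 256 images per iteration


def images_per_second(seconds, images=IMAGES_PER_RUN):
    """Training throughput implied by one timing."""
    return images / float(seconds)


throughput = {gpu: images_per_second(s) for gpu, s in TRAIN_SECONDS.items()}

# Print the cards from fastest to slowest.
for gpu, ips in sorted(throughput.items(), key=lambda kv: -kv[1]):
    print("%-16s %6.1f images/s" % (gpu, ips))
```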

Installation
detailed documentation: http://caffe.berkeleyvision.org/installation.html

required packages:
- CUDA, OpenCV
- BLAS (Basic Linear Algebra Subprograms): operations like matrix multiplication and matrix addition, with implementations for both CPU (cBLAS) and GPU (cuBLAS). Provided by MKL (Intel), ATLAS, OpenBLAS, etc.
- Boost: a C++ library. > Caffe uses some of its math functions and shared_ptr.
- glog, gflags: provide logging & command line utilities. > Essential for debugging.
- leveldb, lmdb: database I/O for your program. > Need to know this for preparing your own data.
- protobuf: an efficient and flexible way to define data structures. > Need to know this for defining new layers.

Define your task

Dog? Cat?

Next step
1. Preparing data => LevelDB, LMDB
2. Model definition (train_val.prototxt)
3. Solver (solver.prototxt)

[Figure: Download image data, then LevelDB/LMDB, then train_val.prototxt, then solver.prototxt]

Preparing data
If you want to run a CNN on another dataset:
- Caffe reads data in a standard database format.
- You have to convert your data to leveldb/lmdb manually.

Creating the image set for the ImageNet dataset, using LMDB:

./convert_imageset --resize_height=256 --resize_width=256 --shuffle ./data/imagenet ./data/imagenet/train.txt ./ilsvrc12_train_lmdb --backend=lmdb
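The list file passed to convert_imageset (train.txt above) holds one image path and one integer label per line. A minimal sketch for producing such a file, assuming a hypothetical layout with one sub-folder per class under the root folder; the function name and the layout are illustrative and not part of Caffe.

```python
import os


def write_image_list(root_dir, list_path, extensions=(".jpg", ".jpeg", ".png")):
    """Write '<class_folder>/<file> <label>' lines; return the sorted class names.

    Assumes (for this sketch only) that root_dir contains one sub-folder per class.
    Labels are the indices of the sorted folder names.
    """
    classes = sorted(
        d for d in os.listdir(root_dir)
        if os.path.isdir(os.path.join(root_dir, d))
    )
    with open(list_path, "w") as f:
        for label, cls in enumerate(classes):
            for fname in sorted(os.listdir(os.path.join(root_dir, cls))):
                if fname.lower().endswith(extensions):
                    f.write("%s/%s %d\n" % (cls, fname, label))
    return classes
```

The paths written here are relative to the root folder, so the same folder would be given to convert_imageset as its root argument.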


Define your network (train_val.prototxt)

Define data layer.

Define specific layers.

Convolution layer.

Fully connected layer (Inner product layer)

Pooling layer.

Activation function layer.

Define loss function.

Define your network (train_val.prototxt)

net:

name: "dummy-net"
layers { name: "data" …}
layers { name: "conv" …}
layers { name: "pool" …}
… more layers …
layers { name: "loss" …}

[Figure legend: blue: layers you need to define; yellow: data blobs]

[Figures: LogReg; LeNet (examples/mnist/lenet_train.prototxt); ImageNet, Krizhevsky 2012]

Define your network (train_val.prototxt)

Data Layer

layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_train_lmdb"
    batch_size: 100
    backend: LMDB
  }
}

layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_test_lmdb"
    batch_size: 100
    backend: LMDB
  }
}

Define your network (train_val.prototxt)

Conv Layer

layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "gaussian"
      std: 0.0001
    }
    bias_filler {
      type: "constant"
    }
  }
}

[Figure: data feeding convolution]

Convolution
• Layer type: Convolution
• CPU implementation: ./src/caffe/layers/convolution_layer.cpp
• CUDA GPU implementation: ./src/caffe/layers/convolution_layer.cu
• Parameters (ConvolutionParameter convolution_param)
  • Required
    • num_output (c_o): the number of filters
    • kernel_size (or kernel_h and kernel_w): specifies height and width of each filter
  • Strongly Recommended
    • weight_filler [default type: 'constant' value: 0]
  • Optional
    • bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
    • pad (or pad_h and pad_w) [default 0]: specifies the number of pixels to (implicitly) add to each side of the input
    • stride (or stride_h and stride_w) [default 1]: specifies the intervals at which to apply the filters to the input
    • group (g) [default 1]: If g > 1, we restrict the connectivity of each filter to a subset of the input. Specifically, the input and output channels are separated into g groups, and the ith output group channels will be only connected to the ith input group channels.
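To see what num_output, kernel_size, and group mean for model size, here is a small sketch that counts the learnable values of a convolution layer. The helper name is made up, and the 3-channel input for conv1 is an assumption (CIFAR-10 images are RGB).

```python
def conv_param_count(num_output, kernel_h, kernel_w, in_channels, group=1, bias_term=True):
    """Number of learnable values in a convolution layer.

    Each filter spans kernel_h x kernel_w x (in_channels / group) inputs;
    with bias_term there is one extra bias per filter.
    """
    if in_channels % group != 0 or num_output % group != 0:
        raise ValueError("in_channels and num_output must be divisible by group")
    weights = num_output * (in_channels // group) * kernel_h * kernel_w
    biases = num_output if bias_term else 0
    return weights + biases


# conv1 above: 32 filters of 5x5 on an assumed 3-channel input.
conv1_params = conv_param_count(32, 5, 5, in_channels=3)
print(conv1_params)
```

With group=2 the weight count halves, because each filter then sees only half of the input channels.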

Define your network (train_val.prototxt)

Fully connected layer

layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 64
    weight_filler {
      type: "gaussian"
      std: 0.1
    }
    bias_filler {
      type: "constant"
    }
  }
}

Inner Product
• Layer type: InnerProduct
• CPU implementation: ./src/caffe/layers/inner_product_layer.cpp
• CUDA GPU implementation: ./src/caffe/layers/inner_product_layer.cu
• Parameters (InnerProductParameter inner_product_param)
  • Required
    • num_output (c_o): the number of filters
  • Strongly recommended
    • weight_filler [default type: 'constant' value: 0]
  • Optional
    • bias_filler [default type: 'constant' value: 0]
    • bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs

Define your network (train_val.prototxt)

Pooling layer

layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 3
    stride: 2
  }
}

Pooling
• Layer type: Pooling
• CPU implementation: ./src/caffe/layers/pooling_layer.cpp
• CUDA GPU implementation: ./src/caffe/layers/pooling_layer.cu
• Parameters (PoolingParameter pooling_param)
  • Required
    • kernel_size (or kernel_h and kernel_w): specifies height and width of each filter
  • Optional
    • pool [default MAX]: the pooling method. Currently MAX, AVE, or STOCHASTIC
    • pad (or pad_h and pad_w) [default 0]: specifies the number of pixels to (implicitly) add to each side of the input
    • stride (or stride_h and stride_w) [default 1]: specifies the intervals at which to apply the filters to the input
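The pad, kernel_size, and stride settings determine the spatial size of each layer's output. The sketch below traces conv1 and pool1 for a 32x32 input. It assumes the usual rounding (down for convolution, up for pooling, which is my understanding of Caffe's behaviour) and leaves out the extra clipping applied to padded pooling; the function names are illustrative.

```python
import math


def conv_output_size(size, kernel_size, pad=0, stride=1):
    """Output height/width of a convolution (rounded down)."""
    return (size + 2 * pad - kernel_size) // stride + 1


def pool_output_size(size, kernel_size, pad=0, stride=1):
    """Output height/width of a pooling layer (rounded up; assumed, see lead-in)."""
    return int(math.ceil((size + 2 * pad - kernel_size) / float(stride))) + 1


# conv1: kernel 5, pad 2, stride 1 keeps a 32x32 input at 32x32.
conv1 = conv_output_size(32, kernel_size=5, pad=2, stride=1)
# pool1: kernel 3, stride 2 roughly halves it.
pool1 = pool_output_size(conv1, kernel_size=3, stride=2)
print(conv1, pool1)
```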

Define your network (train_val.prototxt)

Activation function

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}

ReLU / Rectified-Linear and Leaky-ReLU
• Layer type: ReLU
• CPU implementation: ./src/caffe/layers/relu_layer.cpp
• CUDA GPU implementation: ./src/caffe/layers/relu_layer.cu
• Parameters (ReLUParameter relu_param)
  • Optional
    • negative_slope [default 0]: specifies whether to leak the negative part by multiplying it with the slope value rather than setting it to 0.
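The effect of negative_slope can be written in one line. This is a scalar sketch of the function the layer computes, not Caffe's implementation.

```python
def relu(x, negative_slope=0.0):
    """Rectified linear unit; a non-zero negative_slope gives the leaky variant."""
    return max(x, 0.0) + negative_slope * min(x, 0.0)


print(relu(2.0))         # positive inputs pass through unchanged
print(relu(-2.0))        # plain ReLU clamps negatives to zero
print(relu(-2.0, 0.1))   # leaky ReLU scales negatives by the slope
```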

Define your network (train_val.prototxt)

Loss layer

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

Classification: SoftmaxWithLoss, HingeLoss
Linear Regression: EuclideanLoss
Attributes / Multiclassification: SigmoidCrossEntropyLoss
Others…
New Task: NewLoss

Define your network (train_val.prototxt)

=> a little more about the network

The network does not need to be linear.

[Figure: linear network. A chain of Data, Convolve, Rectify, Pool, Inner Prod, and Predict blocks ending in a Loss block that also takes Label.]

[Figure: directed acyclic graph. The same kinds of blocks with branches, some merged by a Sum block.]

Solving: Training a Net (solver.prototxt)
Optimization, like model definition, is configuration.

train_net: "lenet_train.prototxt"
base_lr: 0.01
momentum: 0.9
weight_decay: 0.0005
max_iter: 10000
snapshot_prefix: "lenet_snapshot"

All you need to run things on the GPU:

> caffe train -solver lenet_solver.prototxt -gpu 0

Solvers: Stochastic Gradient Descent (SGD) + momentum · Adaptive Gradient (ADAGRAD) · Nesterov’s Accelerated Gradient (NAG)

Solving: Training a Net (solver.prototxt)

net: "train_val.prototxt"                                  # Model definition file
test_iter: 100                                             # Iterations for test
test_interval: 500                                         # Test interval
base_lr: 0.001                                             # Learning rate
momentum: 0.9                                              # Momentum
weight_decay: 0.004                                        # Weight decay
lr_policy: "fixed"                                         # Learning rate policy
display: 100                                               # Print
max_iter: 4000                                             # Max iteration number for training
snapshot: 4000                                             # Save
snapshot_prefix: "examples/cifar10/cifar10_quick"          # Save filename
solver_mode: GPU                                           # CPU/GPU
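To make base_lr, momentum, and weight_decay concrete, here is a sketch of one SGD-with-momentum step on a plain list of weights. It follows the standard update (velocity = momentum * velocity - lr * gradient, with L2 weight decay added to the gradient); it is an illustration with made-up names, not Caffe's solver code.

```python
def sgd_momentum_step(weights, grads, velocity,
                      base_lr=0.001, momentum=0.9, weight_decay=0.004):
    """Return (new_weights, new_velocity) after one update step."""
    new_weights, new_velocity = [], []
    for w, g, v in zip(weights, grads, velocity):
        g_total = g + weight_decay * w              # L2 weight decay folded into the gradient
        v_next = momentum * v - base_lr * g_total   # decayed old velocity minus the scaled gradient
        new_velocity.append(v_next)
        new_weights.append(w + v_next)
    return new_weights, new_velocity


# One step from a single weight of 1.0 with gradient 0.5 and zero velocity.
w, v = sgd_momentum_step([1.0], [0.5], [0.0])
print(w, v)
```

With lr_policy "fixed", base_lr stays constant for all max_iter steps.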

Additional details
Download Caffe (https://github.com/BVLC/caffe)

Install
sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libboost-all-dev libhdf5-serial-dev libgflags-dev libgoogle-glog-dev liblmdb-dev protobuf-compiler libatlas-base-dev

cuDNN (optional)
Download from NVIDIA
sudo cp lib* /usr/local/cuda/lib64/
sudo cp cudnn.h /usr/local/cuda/include/

Change config file name
In caffe folder: rename “Makefile.config.example” to “Makefile.config”

To use cuDNN (optional)
In file “Makefile.config”: change “# USE_CUDNN := 1” to “USE_CUDNN := 1”

Compile
In caffe folder: make all -j8 (the -j8 option runs 8 jobs in parallel, so the build is faster)

Download Data
Execute in folder “caffe/data/cifar10”: sh get_cifar10.sh

Create Data
Move file: “caffe/examples/cifar10/create_cifar10.sh” to “caffe/create_cifar10.sh”
Execute in folder “caffe/”: sh create_cifar10.sh
Output files:
- cifar10_test_lmdb
- cifar10_train_lmdb
- mean.binaryproto

Train
Move file: “caffe/examples/cifar10/train_quick.sh” to “caffe/train_quick.sh”
Execute in folder “caffe/”: sh train_quick.sh
Result: 75.11%

Distribute your network.

“.caffemodel”: weight parameters.
“_deploy.prototxt”: model definition file.

name: "CIFAR10_full_deploy"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 32
input_dim: 32
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 32
    pad: 2
    kernel_size: 5
    stride: 1
  }
}
...
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "pool3"
  top: "ip1"
  param {
    lr_mult: 1
    decay_mult: 250
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 10
  }
}
layer {
  name: "prob"
  type: "Softmax"
  bottom: "ip1"
  top: "prob"
}

Finetuning models
Example: ImageNet dataset => Style dataset

Different DB, different number of classes.

Finetuning models

=> what if you want to transfer the weights of an existing model to finetune on another dataset / task?

● Simply change a few lines in the layer definition (new name = new params)

Input: a different source. Last layer: a different classifier.

Original (ImageNet):

layers {
  name: "data"
  type: "Data"
  data_param {
    source: "ilsvrc12_train_leveldb"
    mean_file: "./data/ilsvrc12"
    ...
  }
  ...
}
...
layers {
  name: "fc8"
  type: "InnerProduct"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 1000
    ...
  }
}

Finetuned (Style):

layers {
  name: "data"
  type: "Data"
  data_param {
    source: "style_leveldb"
    mean_file: "./data/ilsvrc12"
    ...
  }
  ...
}
...
layers {
  name: "fc8-style"
  type: "InnerProduct"
  blobs_lr: 1
  blobs_lr: 2
  weight_decay: 1
  weight_decay: 0
  inner_product_param {
    num_output: 20
    ...
  }
}

Finetuning models

old caffe:
> finetune_net.bin solver.prototxt model_file

new caffe:
> caffe train -solver models/finetune_flickr_style/solver.prototxt -weights bvlc_reference_caffenet.caffemodel

In pseudocode:
net = new Caffe::Net("style_solver.prototxt");
net.CopyTrainedNetFrom(pretrained_model);
solver.Solve(net);


Making Deep Residual Network.

Revolution of Depth

ImageNet Classification top-5 error (%)

- ILSVRC'10: 28.2 (shallow)
- ILSVRC'11: 25.8 (shallow)
- ILSVRC'12 AlexNet: 16.4 (8 layers)
- ILSVRC'13: 11.7 (8 layers)
- ILSVRC'14 VGG: 7.3 (19 layers)
- ILSVRC'14 GoogleNet: 6.7 (22 layers)
- ILSVRC'15 ResNet: 3.57 (152 layers)

Depth labels on the slide: Deep, Very deep, Ultra deep

Deep Residual Network

Making .prototxt

layer {
  name: "cifar"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    mean_file: "examples/cifar10/mean.binaryproto"
  }
  data_param {
    source: "examples/cifar10/cifar10_train_lmdb"
    batch_size: 100
    backend: LMDB
  }
}

2734 lines / 44 layers

1202 layers?
2734 / 44 * 1202 = 74687.909… lines?
If 1 line = 1 sec, it takes approximately 20 hours.
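The estimate can be checked in a few lines; the variable names are only for this illustration.

```python
LINES_FOR_44_LAYERS = 2734   # length of the 44-layer prototxt
TARGET_LAYERS = 1202

# Scale the line count linearly with depth.
estimated_lines = LINES_FOR_44_LAYERS / 44.0 * TARGET_LAYERS

# One line per second, converted to hours.
hours = estimated_lines / 3600.0

print("%.3f lines, about %.1f hours" % (estimated_lines, hours))
```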


CONV

Batch Normalization

Scale Layer

ReLU

PYTHON Implementation

from caffe import layers as L

def conv_bn_cifar10(bottom, nout, ks=3, stride=1, pad=0, is_test=True, learn=True):
    # Convolution -> BatchNorm -> Scale -> ReLU block.
    # bias_term is False, so the convolution has one parameter blob (the weights)
    # and takes one param spec; learn=False freezes it.
    if learn:
        param = [dict(lr_mult=1, decay_mult=1)]
    else:
        param = [dict(lr_mult=0, decay_mult=0)]
    conv = L.Convolution(bottom, kernel_size=ks, stride=stride,
                         num_output=nout, pad=pad, param=param,
                         weight_filler=dict(type="msra"), bias_term=False)
    # The three BatchNorm blobs are running statistics, so the solver does not update them.
    bn = L.BatchNorm(conv, param=[dict(lr_mult=0), dict(lr_mult=0), dict(lr_mult=0)],
                     batch_norm_param=dict(use_global_stats=is_test))
    scale = L.Scale(bn)
    relu = L.ReLU(scale)
    return conv, bn, scale, relu
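The point of generating the definition from code is that depth becomes a loop variable. The sketch below shows the same idea without pycaffe, by emitting prototxt text directly. It is not the author's method: the function names, layer names, and default sizes are hypothetical, and it stacks plain CONV/BN/Scale/ReLU blocks without the shortcut connections a residual network needs.

```python
def conv_bn_relu_block(name, bottom, num_output, kernel_size=3, pad=1, stride=1):
    """Return (prototxt_text, top_name) for Convolution -> BatchNorm -> Scale -> ReLU."""
    conv = name + "_conv"
    template = (
        'layer {{ name: "{conv}" type: "Convolution" bottom: "{bottom}" top: "{conv}"\n'
        '  convolution_param {{ num_output: {nout} kernel_size: {ks} pad: {pad} '
        'stride: {stride} bias_term: false }} }}\n'
        'layer {{ name: "{name}_bn" type: "BatchNorm" bottom: "{conv}" top: "{conv}" }}\n'
        'layer {{ name: "{name}_scale" type: "Scale" bottom: "{conv}" top: "{conv}"\n'
        '  scale_param {{ bias_term: true }} }}\n'
        'layer {{ name: "{name}_relu" type: "ReLU" bottom: "{conv}" top: "{conv}" }}\n'
    )
    text = template.format(name=name, conv=conv, bottom=bottom, nout=num_output,
                           ks=kernel_size, pad=pad, stride=stride)
    return text, conv


def stack_blocks(n_blocks, bottom="data", num_output=16):
    """Chain n_blocks blocks, feeding each block's top into the next one."""
    parts = []
    for i in range(n_blocks):
        text, bottom = conv_bn_relu_block("block%d" % (i + 1), bottom, num_output)
        parts.append(text)
    return "".join(parts)


prototxt = stack_blocks(3)
print(prototxt)
```

Changing the argument of stack_blocks is all it takes to go from 3 blocks to several hundred.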

Issues when Training Neural Networks

Most slides were obtained from Stanford CS231n (Prof. Fei-Fei Li).


ConvNets need a lot of data to train.

1. Train on ImageNet (ImageNet data)
2. Finetune network on your own data (your data)

Transfer Learning

1. Train on ImageNet

2. If small dataset: fix all weights (treat CNN as fixed feature extractor), retrain only the classifier, i.e. swap the Softmax layer at the end

3. If you have medium sized dataset, “finetune” instead: use the old weights as initialization, train the full network or only some of the higher layers; retrain bigger portion of the network, or even all of it.

E.g. Caffe Model Zoo: Lots of pretrained ConvNets
https://github.com/BVLC/caffe/wiki/Model-Zoo

...


Tuning Learning Rate.

Double check that the loss is reasonable:

- The loss function returns the loss and the gradient for all parameters.
- Disable regularization: loss ~2.3, “correct” for 10 classes.
- Crank up regularization: loss went up, good. (sanity check)
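Why ~2.3 is the expected value: an untrained softmax classifier spreads its probability roughly evenly, so the correct class gets about 1/10 and the loss is -ln(1/10). A quick check (the function name is illustrative):

```python
import math


def expected_initial_softmax_loss(num_classes):
    """Cross-entropy loss when every class gets probability 1 / num_classes."""
    return -math.log(1.0 / num_classes)


initial_loss = expected_initial_softmax_loss(10)
print(round(initial_loss, 4))
```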

Lets try to train now…

Tip: Make sure that you can overfit a very small portion of the training data.

The code (shown as an image on the original slide):
- takes the first 20 examples from CIFAR-10
- turns off regularization (reg = 0.0)
- uses simple vanilla ‘sgd’

Result: very small loss, train accuracy 1.00, nice!

Lets try to train now…

I like to start with small regularization and find a learning rate that makes the loss go down.

- loss not going down: learning rate too low
- loss exploding: learning rate too high

Loss barely changing: learning rate is probably too low.

Notice train/val accuracy goes to 20% though, what’s up with that? (remember this is softmax)

Okay now lets try learning rate 1e6. What could possibly go wrong?

cost: NaN almost always means high learning rate...

3e-3 is still too high. Cost explodes….

=> Rough range for learning rate we should be cross-validating is somewhere [1e-3 … 1e-5]
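When cross-validating over a range like this, a common approach is to sample candidates uniformly in the exponent rather than in the value, so each decade gets equal coverage. A minimal sketch, with an arbitrary candidate count and seed:

```python
import random


def sample_learning_rates(n, low_exp=-5.0, high_exp=-3.0, seed=0):
    """Draw n learning rates log-uniformly from [10**low_exp, 10**high_exp]."""
    rng = random.Random(seed)
    return [10 ** rng.uniform(low_exp, high_exp) for _ in range(n)]


candidates = sample_learning_rates(10)
for lr in sorted(candidates):
    print("%.2e" % lr)
```

Each candidate would then be used for a short training run, keeping the ones whose loss goes down fastest.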


Monitor and visualize the loss curve

Monitor and visualize the accuracy:

big gap = overfitting => increase regularization strength?

no gap => increase model capacity?


Squeezing out the last few percent.

VGG Net (Oxford)

- Second place in the classification task
- Stacked 3x3 filters
- No LRN layers
- Stride 1

The power of small filters

Suppose we stack two CONV layers with receptive field size 3x3 (and stride 1)

=> Each neuron in 1st CONV sees a 3x3 region of input.

Q: What region of input does each neuron in 2nd CONV see?

Answer: [5x5]

Suppose we stack three CONV layers with receptive field size 3x3

Q: What region of input does each neuron in 3rd CONV see?

Answer: [7x7]
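Both answers follow from one rule: with stride 1, every extra k x k layer widens the view of the input by k - 1 pixels. A short sketch (the function name is illustrative):

```python
def stacked_receptive_field(num_layers, kernel_size=3):
    """Side length of the input region seen by one neuron after num_layers
    stacked stride-1 convolutions with the given kernel size."""
    field = 1
    for _ in range(num_layers):
        field += kernel_size - 1
    return field


for n in (1, 2, 3):
    side = stacked_receptive_field(n)
    print("%d layer(s): %dx%d" % (n, side, side))
```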

The power of small filters

Suppose input has depth C & we want output depth C as well

1x CONV with 7x7 filters
Number of weights:
C*(7*7*C)
= 49 C^2

3x CONV with 3x3 filters
Number of weights:
C*(3*3*C) + C*(3*3*C) + C*(3*3*C)
= 3 * 9 * C^2
= 27 C^2

Fewer parameters and more nonlinearities = GOOD.
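The same count in code, for a concrete depth. C = 64 is an arbitrary choice for illustration, and biases are ignored as in the formulas above.

```python
def conv_weights(kernel_size, channels, num_layers=1):
    """Weights in num_layers stacked CONV layers that keep the depth at `channels`."""
    return num_layers * channels * (kernel_size * kernel_size * channels)


C = 64  # arbitrary depth for illustration
single_7x7 = conv_weights(7, C)                # 49 * C^2
three_3x3 = conv_weights(3, C, num_layers=3)   # 27 * C^2

print(single_7x7, three_3x3)
```

Both options cover a 7x7 region of the input, but the stacked version uses 27/49 of the weights.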

The power of small filters

“More non-linearities” and “deeper” usually give better performance.

=> 1x1 CONV! (Usually follows a normal CONV, e.g. [3x3 CONV - 1x1 CONV])

[Figure: 3x3 CONV view of input; 1x1 CONV view of output of 3x3 CONV]

[Network in Network, Lin et al. 2013]

=> Evidence that using 3x3 instead of 1x1 works better

[Very Deep Convolutional Networks for Large-Scale Image Recognition, Simonyan et al., 2014]

Page 72: 딥러닝구현기법 “usingiemek.org/UploadData/Editor/BBS3/201611/3A8F0C20342940... · 2016-11-25 · 딥러닝구현기법“using Caffe” 2016-11-11 DGIST 미래자동차융합연구센터

• i.e. simulating “fake” data

• explicitly encoding image transformations that shouldn’t change object identity.

Data Augmentation

What the computer sees

Page 73

1. Flip horizontally

Data Augmentation
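As an illustration of the horizontal flip, here is a minimal sketch on a raw pixel buffer. It assumes an interleaved, row-major image (all channels of a pixel stored together, as OpenCV does); the function name is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Mirror an interleaved image left-to-right. Each pixel holds `channels`
// consecutive values; rows are stored one after another with no padding.
std::vector<unsigned char> flip_horizontal(
        const std::vector<unsigned char>& image,
        std::size_t height, std::size_t width, std::size_t channels) {
    assert(image.size() == height * width * channels);
    std::vector<unsigned char> flipped(image.size());
    for (std::size_t y = 0; y < height; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            const std::size_t src = (y * width + x) * channels;
            const std::size_t dst = (y * width + (width - 1 - x)) * channels;
            // Copy the whole pixel so the channel order is preserved.
            std::copy_n(image.begin() + src, channels, flipped.begin() + dst);
        }
    }
    return flipped;
}
```

During training the flip would typically be applied with probability 0.5 per sample; the label stays unchanged.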


Page 74

2. Random crops/scales

Sample these during training (also helps a lot during test time)

e.g. common to see even up to 150 crops used

Data Augmentation
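Random cropping can be sketched as picking a window position uniformly at random and copying that window out. The example assumes a single-channel, row-major image to stay short; the function name and signature are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Cut a crop_h x crop_w window at a uniformly random position out of a
// single-channel, row-major image of size height x width.
std::vector<float> random_crop(const std::vector<float>& image,
                               std::size_t height, std::size_t width,
                               std::size_t crop_h, std::size_t crop_w,
                               std::mt19937& rng) {
    assert(image.size() == height * width);
    assert(crop_h > 0 && crop_w > 0);
    assert(crop_h <= height && crop_w <= width);
    std::uniform_int_distribution<std::size_t> top_dist(0, height - crop_h);
    std::uniform_int_distribution<std::size_t> left_dist(0, width - crop_w);
    const std::size_t top = top_dist(rng);
    const std::size_t left = left_dist(rng);
    std::vector<float> crop;
    crop.reserve(crop_h * crop_w);
    for (std::size_t y = 0; y < crop_h; ++y) {
        for (std::size_t x = 0; x < crop_w; ++x) {
            crop.push_back(image[(top + y) * width + (left + x)]);
        }
    }
    return crop;
}
```

At test time the same function can be called several times on one image and the resulting predictions averaged.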

Page 75

3. Random mix/combinations of:

- translation

- rotation

- stretching

- shearing

- lens distortions, … (go crazy)

Data Augmentation


Page 76

4. Color jittering (maybe even contrast jittering, etc.)

- Simple: Change contrast by small amounts, jitter the color distributions, etc.

- Vignette, ... (go crazy)

Data Augmentation

Page 77

Data Augmentation

4. Color jittering (maybe even contrast jittering, etc.)

- Simple: Change contrast by small amounts, jitter the color distributions, etc.

Fancy PCA way:

1. Compute PCA on all [R,G,B] pixel values in the training data

2. Sample some color offset along the principal components at each forward pass

3. Add the offset to all pixels in a training image

(As seen in [Krizhevsky et al. 2012])
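Steps 2 and 3 of the PCA approach can be sketched as follows. The sketch assumes step 1 has already been done offline, so the three eigenvectors and eigenvalues of the RGB covariance are passed in; it also assumes a float image with three interleaved channels in the same order as the eigenvectors. The standard deviation of 0.1 follows Krizhevsky et al.; all function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

using Vec3 = std::array<float, 3>;

// Draw one random coefficient per principal component (Gaussian, mean 0).
Vec3 sample_alpha(std::mt19937& rng, float stddev = 0.1f) {
    std::normal_distribution<float> gauss(0.0f, stddev);
    Vec3 alpha = {0.0f, 0.0f, 0.0f};
    for (float& a : alpha) a = gauss(rng);
    return alpha;
}

// Step 2: offset = sum_i alpha[i] * eigvals[i] * eigvecs[i].
// eigvecs[i] is the i-th principal component in color space.
Vec3 color_offset(const std::array<Vec3, 3>& eigvecs, const Vec3& eigvals,
                  const Vec3& alpha) {
    Vec3 offset = {0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < 3; ++i) {
        for (std::size_t c = 0; c < 3; ++c) {
            offset[c] += alpha[i] * eigvals[i] * eigvecs[i][c];
        }
    }
    return offset;
}

// Step 3: add the same offset to every pixel of an interleaved
// three-channel image. No clamping is done in this sketch.
void add_offset(std::vector<float>& pixels, const Vec3& offset) {
    assert(pixels.size() % 3 == 0);
    for (std::size_t i = 0; i < pixels.size(); ++i) {
        pixels[i] += offset[i % 3];
    }
}
```

A new `alpha` would be drawn each time an image is fed to the network, so the same image gets a slightly different color shift on every pass.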


Page 78

Notice the more general theme:

1. Introduce a form of randomness in forward pass

2. Marginalize over the noise distribution during prediction

DropConnect

Dropout

Data Augmentation,

Model Ensembles
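In practice, marginalizing over the noise at prediction time is often approximated by averaging the class scores of several forward passes, for example over different crops, flips, or ensemble members. A minimal sketch, with a hypothetical function name and assuming every pass returns the same number of scores:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Average class scores over several forward passes (e.g. one per crop,
// flip, or ensemble member). Each inner vector is one pass.
std::vector<float> average_predictions(
        const std::vector<std::vector<float>>& passes) {
    assert(!passes.empty());
    const std::size_t num_classes = passes.front().size();
    std::vector<float> mean(num_classes, 0.0f);
    for (const std::vector<float>& scores : passes) {
        assert(scores.size() == num_classes);
        for (std::size_t c = 0; c < num_classes; ++c) {
            mean[c] += scores[c];
        }
    }
    for (float& m : mean) {
        m /= static_cast<float>(passes.size());
    }
    return mean;
}
```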

Page 79

Real-time application using Caffe.

Page 80

ImageNet Competition

Total 1000 classes

Each class has about 1300 images for training. (1300 x 1000 = 1,300,000)

Training a CNN model takes about one week.

Deep Learning

Hand-crafted

Page 81

Real-time Object Recognition

VGG Net (Oxford)

- Second place in the classification task
- Stacked 3x3 filters
- No LRN layers
- Stride 1
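One consequence of 3x3 filters with stride 1 is that, with one pixel of zero padding (the `pad: 1` setting in the VGG deploy prototxt shown later in this deck), each CONV layer keeps the spatial size of its input. The standard output-size formula makes this easy to check; the function name below is hypothetical.

```cpp
#include <cassert>

// Spatial output size of a convolution along one dimension:
// (input - kernel + 2 * pad) / stride + 1, using integer division.
int conv_output_size(int input, int kernel, int pad, int stride) {
    assert(input > 0 && kernel > 0 && pad >= 0 && stride > 0);
    assert(input + 2 * pad >= kernel);
    return (input - kernel + 2 * pad) / stride + 1;
}
```

For a 224x224 input, `conv_output_size(224, 3, 1, 1)` gives 224, so in such a network only the pooling layers reduce resolution.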

Page 82

Real-time Object Recognition (contd.)

CPU : Intel i5-4690 CPU 3.5GHz

RAM : 18GB

GPU : NVIDIA Geforce GTX770

Page 83

Implementation Detail

Step 1. Windows 7 64bit / NVIDIA graphics card (optional)

Step 2. Install CUDA 6.5

Step 3. Download Caffe-windows (http://github.com/niuzhiheng/caffe)

Step 4. Download 3rd party libraries (http://github.com/niuzhiheng/caffe)

Step 5. Download VGG pre-trained weights / architecture (http://www.robots.ox.ac.uk/~vgg/research/very_deep/)

Step 6. Implement Code

Page 84

Implementation Detail (contd.)

Pipeline: Camera -> Frame (Image) -> Resizing -> CNN (Forward Propagation) -> Result (Top 5)

Libraries: OpenCV, CUDA, Caffe
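The last stage of the pipeline, picking the Top-5 classes from the network output, can be sketched with the standard library alone. The function name is hypothetical; it assumes the scores are given as one float per class, as in the 1000-element output array used later in the deck.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Return the indices of the k largest scores, best first.
std::vector<std::size_t> top_k_indices(const std::vector<float>& scores,
                                       std::size_t k) {
    k = std::min(k, scores.size());
    std::vector<std::size_t> indices(scores.size());
    std::iota(indices.begin(), indices.end(), std::size_t{0});
    // Only the first k positions need to be sorted.
    std::partial_sort(indices.begin(), indices.begin() + k, indices.end(),
                      [&scores](std::size_t a, std::size_t b) {
                          return scores[a] > scores[b];
                      });
    indices.resize(k);
    return indices;
}
```

Calling `top_k_indices(scores, 5)` on the class scores yields the five class ids to display; mapping ids to readable labels needs a separate label file.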

Page 85

Implementation Detail (contd.)

// Test mode
Caffe::set_phase(Caffe::TEST);

// Mode setting - CPU/GPU
Caffe::set_mode(Caffe::GPU);

// GPU device number
int device_id = 0;
Caffe::SetDevice(device_id);

// prototxt (network architecture)
Net<float> caffe_test_net("VGG_ILSVRC_19_layers_deploy.prototxt");

// caffemodel (trained weights)
caffe_test_net.CopyTrainedLayersFrom("VGG_ILSVRC_19_layers.caffemodel");

name: "VGG_ILSVRC_19_layers"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 224
input_dim: 224
layers {
  bottom: "data"
  top: "conv1_1"
  name: "conv1_1"
  type: CONVOLUTION
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
  }
}
layers {
  bottom: "conv1_1"
  top: "conv1_1"
  name: "relu1_1"
  type: RELU
}

<prototxt>

http://caffe.berkeleyvision.org/

Page 86

// Copy the resized frame into the input blob: interleaved pixels
// (OpenCV IplImage) -> planar channels (Caffe blob), minus the per-channel mean.
for (k = 0; k < 3; k++) {
    for (i = 0; i < IMAGE_SIZE; i++) {
        for (j = 0; j < IMAGE_SIZE; j++) {
            blob.mutable_cpu_data()[blob.offset(0, k, i, j)] =
                (float)(unsigned char)small_image->imageData[i * small_image->widthStep + j * small_image->nChannels + k]
                - mean_val[k];
        }
    }
}
input_vec.push_back(&blob);

// Forward propagation
float loss;
const vector<Blob<float>*>& result = caffe_test_net.Forward(input_vec, &loss);

// Copy output (scores for the 1000 ImageNet classes)
for (i = 0; i < 1000; i++) {
    output[i] = result[0]->cpu_data()[i];
}
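The index arithmetic in the copy loop can be tried out without Caffe or OpenCV. The sketch below reproduces the same interleaved-to-planar conversion on plain vectors; `row_stride` plays the role of `widthStep`, three channels are assumed, and the function name and mean values are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Convert an interleaved 3-channel byte image into planar float data
// (channel, row, column order) and subtract a per-channel mean.
// row_stride is the number of bytes per image row (>= width * 3).
std::vector<float> to_planar_minus_mean(
        const std::vector<unsigned char>& interleaved,
        std::size_t height, std::size_t width, std::size_t row_stride,
        const std::array<float, 3>& mean) {
    assert(row_stride >= width * 3);
    assert(interleaved.size() >= height * row_stride);
    std::vector<float> planar(3 * height * width);
    for (std::size_t k = 0; k < 3; ++k) {
        for (std::size_t i = 0; i < height; ++i) {
            for (std::size_t j = 0; j < width; ++j) {
                // Same layout as a blob offset with batch index 0.
                planar[(k * height + i) * width + j] =
                    static_cast<float>(interleaved[i * row_stride + j * 3 + k])
                    - mean[k];
            }
        }
    }
    return planar;
}
```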

Page 87

Thank You!!

E-mail : [email protected]