
[Pr12] dann jaejun yoo


Page 1

Domain Adversarial Training of Neural Network

Understanding the paper together with PR12

* A review based on Domain-Adversarial Training of Neural Networks, Y. Ganin et al., 2016

Jaejun Yoo

Ph.D. Candidate @KAIST

PR12

4TH MAY, 2017

Page 2

Usually we try to…

Test (target)

Training (source)

Page 3

For simplicity, let’s consider the binary classification problem

Page 4
Page 5

The usual supervised learning setting assumes that the training and test domains are the same.

Page 6
Page 7
Page 8
Page 9

TAXONOMY OF TRANSFER LEARNING

Page 10
Page 11
Page 12
Page 13

Electronics product reviews (X) / positive or negative labels (Y)

Page 14

Electronics product reviews (X) / positive or negative labels (Y)

Video game reviews (X)

Page 15

Electronics product reviews (X) / positive or negative labels (Y)

Video game reviews (X)

From the hypothesis space H represented by a neural network…

Page 16

Electronics product reviews (X) / positive or negative labels (Y)

Video game reviews (X)

We train a classifier h: even though the target labels are unknown, we want an h that predicts labels well on both the source (X, Y) and target (X) domains.

From the hypothesis space H represented by a neural network…

Page 17

DANN

Page 18

DANN

TRY TO CLASSIFY WELL WITH THE EXTRACTED FEATURE!

Ordinary classification

POSITIVE

NEGATIVE

Customer review comments

Page 19

DANN

Ordinary classification

Domain classification

Electronics

Video games

TRY TO CLASSIFY WELL WITH THE EXTRACTED FEATURE!

POSITIVE

NEGATIVE

Customer review comments

Page 20

DANN

Ordinary classification

Domain classification

Electronics

Video games

TRY TO CLASSIFY WELL WITH THE EXTRACTED FEATURE!

POSITIVE

NEGATIVE

Customer review comments

TRY TO EXTRACT DOMAIN INDEPENDENT FEATURE!

Page 21

DANN

Ordinary classification

Domain classification

Electronics

Video games

TRY TO CLASSIFY WELL WITH THE EXTRACTED FEATURE!

POSITIVE

NEGATIVE

Customer review comments

TRY TO EXTRACT DOMAIN INDEPENDENT FEATURE!

e.g. f: compact, sharp, blurry → easy to discriminate the domain

⇓

f: good, excited, nice, never buy, …
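The two goals on these slides — classify sentiment well while extracting a domain-independent feature f — are implemented in DANN through a gradient reversal layer: identity in the forward pass, gradient multiplied by −λ in the backward pass, so the feature extractor is trained to fool the domain classifier. A minimal numpy sketch (class name and λ value are illustrative, not from the paper's code):

```python
import numpy as np

class GradientReversal:
    """Identity in the forward pass; scales the gradient by -lam in the
    backward pass, so upstream layers *maximize* the domain loss."""
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, x):
        return x  # features pass through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output  # flip and scale the gradient

grl = GradientReversal(lam=0.5)
x = np.array([1.0, -2.0, 3.0])
g = np.array([0.1, 0.2, -0.3])
print(grl.forward(x))    # identical to x
print(grl.backward(g))   # -0.5 * g
```

Because the reversal only touches the backward pass, the rest of the network can be trained with plain gradient descent on the sum of both losses.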

Page 22

• Combining DA and feature learning within one training process

• A principled way to learn a good representation based on a generalization guarantee: minimize the H-divergence directly (no heuristic)

“When the DA algorithm works, and when it does not.”

“Why it works.”

DANN
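“Minimize the H-divergence directly” corresponds to the paper's saddle-point objective over the feature, label, and domain parameters (θ_f, θ_y, θ_d); this is my transcription from the paper, so constants may differ slightly from the slide:

```latex
E(\theta_f,\theta_y,\theta_d)
  = \frac{1}{n}\sum_{i=1}^{n}\mathcal{L}_y^{\,i}(\theta_f,\theta_y)
  - \lambda\!\left(
      \frac{1}{n}\sum_{i=1}^{n}\mathcal{L}_d^{\,i}(\theta_f,\theta_d)
    + \frac{1}{n'}\sum_{i=n+1}^{N}\mathcal{L}_d^{\,i}(\theta_f,\theta_d)
    \right)
```

The saddle point minimizes E over (θ_f, θ_y) while maximizing it over θ_d, which is exactly what the gradient reversal layer realizes in a single backward pass.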

Page 23

Conventional strategy: find the model with minimal training error using as few parameters as possible.

Page 24

Now the training domain (source) and the testing domain (target) differ.

We need an additional strategy beyond the conventional one.

Page 25
Page 26

PREREQUISITE

Different distances

Slide courtesy of Sungbin Lim, DeepBio, 2017

Page 27


Page 28

A Bound on the Adaptation Error

1. The difference across all measurable subsets cannot be estimated from finite samples

2. We’re only interested in differences related to classification error

Page 29

Idea: measure subsets where hypotheses in H disagree

The subsets A are error sets of one hypothesis with respect to another

1. Always a lower bound on the L1 distance

2. Computable from finite unlabeled samples (Kifer et al. 2004)

3. Train a classifier to discriminate between source and target data
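Point 3 is usable in practice: train a domain classifier on unlabeled source vs. target samples, and turn its error ε into the divergence estimate 2(1 − 2ε) (the "proxy A-distance"). A hedged numpy sketch, using a hand-rolled logistic regression and the training-set error as a toy estimate (the real procedure would use a held-out set, and all hyperparameters here are illustrative):

```python
import numpy as np

def proxy_a_distance(Xs, Xt, lr=0.1, epochs=500):
    """Train a logistic-regression domain classifier (source=0, target=1)
    and return 2*(1 - 2*err). Near 0: domains indistinguishable;
    near 2: perfectly separable."""
    X = np.vstack([Xs, Xt])
    y = np.concatenate([np.zeros(len(Xs)), np.ones(len(Xt))])
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(epochs):                      # full-batch gradient descent
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    err = np.mean((1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5) != y)
    return 2.0 * (1.0 - 2.0 * err)

rng = np.random.default_rng(0)
near = proxy_a_distance(rng.normal(0, 1, (200, 2)),
                        rng.normal(0, 1, (200, 2)))   # same distribution
far = proxy_a_distance(rng.normal(0, 1, (200, 2)),
                       rng.normal(5, 1, (200, 2)))    # shifted distribution
print(near, far)
```

Identically distributed samples give a value near 0 (the classifier cannot beat chance), while well-separated samples drive it toward 2.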

Page 30

A Computable Adaptation Bound

Two annotated terms in the bound: the divergence estimation term, and a complexity term dependent on the number of unlabeled samples.
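Reconstructing from reference 2 (Ben-David et al. 2010), the bound these annotations point at is presumably of the form:

```latex
\epsilon_T(h) \le \epsilon_S(h)
  + \frac{1}{2}\,\hat{d}_{\mathcal{H}\Delta\mathcal{H}}(\mathcal{U}_S,\mathcal{U}_T)
  + 4\sqrt{\frac{2d\log(2m') + \log(2/\delta)}{m'}}
  + \lambda
```

where the second term is the H∆H-divergence estimated from unlabeled samples U_S, U_T, the square-root term is the complexity term in the number m′ of unlabeled samples (d is the VC dimension of H), and λ is the error of the optimal joint hypothesis introduced on the next slide.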

Page 31

The optimal joint hypothesis h* = argmin h∈H [ε_S(h) + ε_T(h)] is the hypothesis with minimal combined error

λ = ε_S(h*) + ε_T(h*) is that error

Page 32

THANKS TO GENERALIZATION GUARANTEE

Page 33

THEORETICAL RESULTS

Page 34

THEORETICAL RESULTS

h ∈ H ⟺ 1 − h ∈ H

Pages 35-36: THEORETICAL RESULTS
Pages 37-44: DANN
Page 45

SHALLOW DANN

Page 46

SHALLOW DANN
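A shallow DANN like the two-moon demo in the linked repo fits in a few dozen lines of numpy. The sketch below substitutes synthetic Gaussian data for the actual two-moon set, and every size and hyperparameter (hidden width, λ, learning rate, iteration count) is illustrative rather than taken from the repo:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Toy stand-in for the two-moon experiment: a linearly separable source
# task plus a covariate-shifted target copy whose labels we never use.
n = 200
Xs = rng.normal(0.0, 1.0, (n, 2))
ys = (Xs[:, 0] + Xs[:, 1] > 0).astype(float)    # source labels
Xt = Xs + np.array([2.0, 0.0])                  # shifted, unlabeled target

h = 8                                           # hidden width
W1 = rng.normal(0.0, 0.5, (2, h)); b1 = np.zeros(h)   # feature extractor
wy = np.zeros(h); by = 0.0                            # label predictor
wd = np.zeros(h); bd = 0.0                            # domain classifier
lam, lr = 0.1, 0.1                              # reversal strength, step size

for _ in range(500):
    Zs = np.tanh(Xs @ W1 + b1)                  # source features
    Zt = np.tanh(Xt @ W1 + b1)                  # target features
    # label loss gradient (source only)
    dy = (sigmoid(Zs @ wy + by) - ys) / n
    # domain loss gradient (source = 0, target = 1)
    Z = np.vstack([Zs, Zt])
    d = np.concatenate([np.zeros(n), np.ones(n)])
    dd = (sigmoid(Z @ wd + bd) - d) / (2 * n)
    # gradient reversal: the domain gradient enters the features negated
    Gd = np.outer(dd, wd)
    gZs = np.outer(dy, wy) - lam * Gd[:n]
    gZt = -lam * Gd[n:]
    gA_s = gZs * (1.0 - Zs**2); gA_t = gZt * (1.0 - Zt**2)   # tanh'
    W1 -= lr * (Xs.T @ gA_s + Xt.T @ gA_t)
    b1 -= lr * (gA_s.sum(0) + gA_t.sum(0))
    wy -= lr * Zs.T @ dy; by -= lr * dy.sum()
    wd -= lr * Z.T @ dd; bd -= lr * dd.sum()

acc_src = np.mean((sigmoid(np.tanh(Xs @ W1 + b1) @ wy + by) > 0.5) == (ys > 0.5))
print("source accuracy:", acc_src)
```

The only DANN-specific part is the negated λ·Gd term: delete it and this is an ordinary one-hidden-layer classifier with an unused domain head.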

Page 47

t-SNE RESULTS

Page 48

REFERENCE

PAPERS

1. A Survey on Transfer Learning, S. J. Pan and Q. Yang, 2009

2. A Theory of Learning from Different Domains, S. Ben-David et al., 2010

3. Domain-Adversarial Training of Neural Networks, Y. Ganin et al., 2016

BLOG

1. http://jaejunyoo.blogspot.com/2017/01/domain-adversarial-training-of-neural.html

2. https://github.com/jaejun-yoo/tf-dann-py35

3. https://github.com/jaejun-yoo/shallow-DANN-two-moon-dataset

SLIDES

1. http://www.di.ens.fr/~germain/talks/nips2014_dann_slides.pdf

2. http://john.blitzer.com/talks/icmltutorial_2010.pdf (DA theory part)

3. https://epat2014.sciencesconf.org/conference/epat2014/pages/slides_DA_epat_17.pdf (DA theory part)

4. https://www.slideshare.net/butest/ppt-3860159 (DA theory part)

VIDEO

1. https://www.youtube.com/watch?v=h8tXDbywcdQ (Terry Um's deep learning talk)

2. https://www.youtube.com/watch?v=F2OJ0fAK46Q (DA theory part)

3. https://www.youtube.com/watch?v=uc6K6tRHMAA&index=13&list=WL&t=2570s (DA theory part)