46
Data Mining Lab, Big Data Research Center, UESTC Email[email protected] DM LESS IS MORE Representation-based Transfer Learning and Some Advances Wei Han

Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

Data Mining Lab, Big Data Research Center, UESTCEmail:[email protected]

DMLESS IS MORE

Representation-based Transfer Learning and Some Advances

Wei Han

Page 2: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室Outline

1. Motivation

2. Definition of Transfer Learning

2.1. Problem formulation

2.2. Confusing related areas

3. Representation-based Transfer Learning

3.1. Basic idea and classical methods

3.2. Some Interesting and Advances

4. Future Works

Page 3: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室

Motivation

Page 4: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室1.1. Exciting foresight

Page 5: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室1.2. A Motivating Example

• Goal: to train a robot to accomplish Task 𝑻1 in an indoor

environment 𝑬1 using machine learning techniques:

• Sufficient training data required: sensor readings to measure the

environment as well as human supervision, i.e. labels

• A predictive model can be learned, and used in the same

environment

Task 𝑻1 in environment 𝑬1

Page 6: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室1.2. A Motivating Example

?Environment changes 𝑬2Task 𝑻1

?Task 𝑻2

?New robot

Page 7: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室1.3. Motivation

• People is of capacity to learn quickly and precisely with

priori knowledge of target domain and the knowledge

transferred from the similar but different (source) domain

• Performance of traditional machine learning techniques

highly relies on whether sufficient labeled data is

available to build a predictive model

• When environment changes (e.g., new domain or new

task), the learned predictive model performs poorly

Page 8: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室

Definition

Page 9: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室2.1.1. Intuitive View

• Inspired by human’s transfer of learning ability

• The ability of a system to recognize and apply knowledge

and skills learned in previous domains/tasks to novel

tasks/domains, which share some commonality

Page 10: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室2.1.2. Formal Concepts

• Common concepts of TL• Domain: A domain D consists of two components: a feature space X and

a marginal probability distribution P(X), where 𝑋 = {𝑥1, … , 𝑥𝑛} ∈ 𝒳

• Task: A task consists of two components: a label space Y and an

objective predictive function 𝑓(·) (denoted by 𝑇 = {𝑦, 𝑓(·)})

• Formal Definition of TL• Condition: Given a source domain 𝐷𝑆 and learning task 𝑇𝑆, a target

domain 𝐷𝑇 and learning task 𝑇𝑇

• Goal: Transfer learning aims to help improve the learning of the target

predictive function 𝑓𝑇(·) in 𝐷𝑇 using the knowledge in 𝐷𝑆 and 𝑇𝑆

• Limitation: where 𝐷𝑆 ≠ 𝐷𝑇, or 𝑇𝑆 ≠ 𝑇𝑇

Page 11: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室2.1.3. Further Understanding

• Given a target domain/task, transfer learning aims to

1) identify the commonality between the target domain/task

and previous domains/tasks

2) transfer knowledge from the previous domains/tasks to the

target one such that human supervision on the target

domain/task can be dramatically reduced.

Source Domain

/ Task Data

Target Domain

/ Task Data

Predictive

Models

Sufficient labeled

training data

Unlabeled training/with a few labeled data

Transfer Learning

Algorithms

Target Domain

/ Task Data

Testing

Page 12: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室2.2.1. Category of Transfer Learning

Learning Setting Source and Target Domain Source and Target Task

Traditional Machine Learning The same The same

Inductive Transfer Learning/

Unsupervised Transfer Learning

The same Different but related

Different but related Different but related

Transductive Transfer Learning Different but related The same

• The category of transfer learning problems

Page 13: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室2.2.1. Category of Transfer Learning

Transfer learning approaches Description

Instance-transfer To re-weight some labeled data in a source

domain for use in the target domain

Feature-representation-transfer Find a “good” feature representation that reduces

difference between a source and a target domain

or minimizes error of models

Model(Parameter)-transfer Discover shared parameters or priors of models

between a source domain and a target domain

Relational-knowledge-transfer Build mapping of relational knowledge between a

source domain and a target domain.

• Basic ideas of transfer learning approaches

Page 14: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室2.2.1. Category of Transfer Learning

Inductive

Transfer Learning

Transductive

Transfer Learning

Unsupervised

Transfer Learning

Instance-transfer √ √

Feature-representation-

transfer

√ √ √

Model(Parameter)-

transfer

Relational-knowledge-

transfer

• Scope of approaches

Page 15: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室2.2.1. Other Categories

• Homogeneous/Heterogeneous transfer learning

𝑋𝑆 ≠ 𝑋𝑇, 𝑃(𝑋𝑆) ≠ 𝑃(𝑋𝑇), 𝑌𝑆 ≠ 𝑌𝑇

• Symmetric/Asymmetric feature-based transfer learning

Page 16: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室2.2.2. Confusing Related Areas

• Directly Similar:

• Domain adaptation

• Multi-view learning

• Zero-shot/Few-shot learning

• Multi-task learning

• Indirectly Similar:

• Learning to learn

• Label embedding/Attribute

• Continuous learning

• Lifelong learning

Page 17: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室

Representation-based

Transfer Learning

Page 18: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.1.1. Overview

• Assumption:Source and target domains only have some

overlapping features

• Idea:Through feature transformation, the data in two

domain are merged into one feature space

• Classical Methods:➢ Transfer component analysis (TCA) [Pan, TKDE-11]

➢ Geodesic flow kernel (GFK) [Duan, CVPR-12]

➢ Transfer kernel learning [Long, TKDE-15]

➢ ……

Page 19: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.1.2. Transfer Component Analysis

• Main idea: the learned 𝜑 should map the source domain and

target domain data to a latent space spanned by the factors that

reduce domain distance as well as preserve data structure

Page 20: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.1.2. Transfer Component Analysis

• High level optimization problem

min Dist 𝜑 𝑋𝑆 ,𝜑 𝑋𝑇𝜑

s.t.

+ 𝜆Ω(𝜑)

constraints on 𝜑 𝑋𝑆 and 𝜑 𝑋𝑇

Maximum Mean Discrepancy (MMD)

Page 21: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.1.3. Geodesic Flow Kernel

Idea: Data are mapped into manifold space, and the distance

between two domain is measured and minimized

Page 22: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.1. Leverage Domain-specific Info

Electronics Video Games

(1) Compact; easy to operate;

very good picture quality;

looks sharp!

(2)A very good game! It is

action packed and full of

excitement. I am very much

hooked on this game.

(3) I purchased this unit from

Circuit City and I was very action and good plots. We

excited about the quality of the played this and were hooked.

picture. It is really nice and

sharp.

(4) Very realistic shooting

(5) It is also quite blurry in

very dark settings. I will

never_buy HP again.

(6) The game is so boring. I

am extremely unhappy and

will probably never_buy

UbiSoft again.

TL for Sentiment Classification

Page 23: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.1. Leverage Domain-specific Info

• Three different types of features

– Source domain (Electronics) specific features, e.g.,

compact, sharp, blurry

– Target domain (Video Game) specific features, e.g.,

hooked, realistic, boring

– Domain independent features (pivot features), e.g.,

good, excited, nice, never_buy

Page 24: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.1. Leverage Domain-specific Info

• Intuition

– Use pivot features as a bridge to connect domain- specific features

– Model correlations between pivot features and domain-specific

features

– Discover new shared features by exploiting the feature

correlations

• How to select pivot features?

– Term frequency on both source and target domain data.

– Mutual information between features and source domain labels

– Mutual information on between features and domains

Page 25: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.1. Leverage Domain-specific Info

➢If two domain-specific words have connections to more common pivot words in

the graph, they tend to be aligned or clustered together with a higher probability.

➢If two pivot words have connections to more common domain-specific words in

the graph, they tend to be aligned together with a higher probability.

exciting

good

never_buy

sharp

boring

blurry

hooked

compact

realisticPivot features

Domain-specific features

7

6

38

6

2

4

5

Electronics

Video Game

Spectral Feature Alignment (SFA)

Page 26: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.1. Leverage Domain-specific Info

exciting

good

never_buy

sharp

boring

blurry

hooked

compact

realisticPivot features

Domain-specific features

7

6

38

6

2

4

5

Electronics

Video Game

boringcompact

realistic

hooked

blurry

sharpElectronics

Video GameElectronics

Electronics Video Game

Video Game

Spectral Clustering

Page 27: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.1. Leverage Domain-specific Info

y f (x) sgn(w xT ), w [1,1,1]

Electronics

Video Game

Training

Prediction

sharp/hooked compact/realistic blurry/boring

1 1 0

1 0 0

0 0 1

sharp/hooked compact/realistic blurry/boring

1 0 0

1 1 0

0 0 1

Page 28: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.2. Multi-source Transfer Learning

Transitive transfer learning [Tan, KDD-15]In two dissimilar domains, intermediate domains are leveraged to

help knowledge transform

Page 29: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.2. Multi-source Transfer Learning

How to select intermediate domain?

• Domain complexity

The domain complexity is calculated as the percentage of long tail features

that have low frequency.

• A-distance

The A-distance estimates the distribution difference of two sets of data

samples that are drawn from two probability distributions

Page 30: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.2. Multi-source Transfer Learning

Intermediate Domain Selection

• Given a triple tr = {S, D, T}

• Take six measurements as features to construct a logistic

regression model

• Estimate variables with MLE

Page 31: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.2. Multi-source Transfer Learning

Nonnegative Matrix tri-Factorization:

• The matrix 𝐹 ∈ ℝ𝑚×𝑝indicates the information of feature

clusters and p is the number of hidden feature clusters

• The matrix G ∈ ℝ𝑐×𝑛 is the instance cluster assignment matrix

and c is the number of instance clusters

• 𝐴 ∈ ℝ𝑝×𝑐 is the association matrix. c is the number of instance

clusters or label classes

Page 32: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.3. Distant Transfer Learning

Distant domain TL [Tan, AAAI-17]

In the transferring between two highly dissimilar domains, the

autoencoder is used to select unlabeled intermediate data from

multiple auxiliary domain.

Page 33: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.3. Distant Transfer Learning

Idea: Reconstruction errors on the source domain data and the

target domain data are both small

Page 34: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.3. Distant Transfer Learning

Page 35: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.3. Distant Transfer Learning

Results:

Page 36: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.4. Label Embedding/Attribute for TL

Simultaneous Deep Transfer Across Domains and

Tasks [Tzeng, ICCV-15]Simultaneously optimizes for domain invariance to facilitate

domain transfer and uses a soft label distribution matching

loss to transfer information between tasks

Page 37: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.4. Label Embedding/Attribute for TL

Page 38: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.5. DNN for Transfer Learning

SHL-MDNN [Huang, ICASSP-13]Sharing hidden layers in DNN model, and learning different tasks

by different softmax layers

Page 39: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.6. GAN for Transfer Learning

Adversarial Discriminative Domain Adaptation

[Tzeng, arXiv-17]

Page 40: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.6. GAN for Transfer Learning

GoGAN [Liu, NIPS-16]

Page 41: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.6. GAN for Transfer Learning

CDD [Fu, arXiv-17]

Page 42: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.6. GAN for Transfer Learning

Domain-Adversarial Training of Neural Networks

[Ganin, JMLR-16]

Page 43: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室3.2.6. GAN for Transfer Learning

SBAGA-GAN [Russo, arXiv-17]

Page 44: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室

Future Work

Page 45: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

DMLESS IS MORE Data Mining Lab

数据挖掘实验室4.1. Open Questions

• Theoretical study beyond generalization error bound

(Negative transfer learning, Domain similarity metric)

• Given a source domain and a target domain, determine

whether transfer learning should be performed

• For a specific transfer learning method, given a source and a

target domain, determine whether the method should be used

for knowledge transfer

• Good (Interpretive) representation

• Transfer learning with plenty of source domains

• Online transfer learning

• Transfer learning for deep reinforcement learning

• Lifelong continuous learning

Page 46: Representation-based Transfer Learning and Some Advances Wei …dm.uestc.edu.cn › wp-content › uploads › seminar › 20170726_Repres… · Simultaneous Deep Transfer Across

Thank

you

DMLESS IS MORE