Selected Applications of Transfer Learning
Qiang Yang (杨强)
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology
Hong Kong
http://www.cse.ust.hk/~qyang
Case 1: Target Class Changes → Target-Transfer Learning (目标变化 → 目标迁移)
- Training: a 2-class problem. Testing: a 10-class problem. Traditional methods fail.
- Solution: find out what is not changed between training and testing.
Our Work
Cross-Domain Learning
- TrAdaBoost (ICML 2007)
- Co-Clustering based Classification (SIGKDD 2007)
- TPLSA (SIGIR 2008)
- NBTC (AAAI 2007)
Translated Learning
- Cross-lingual classification (WWW 2008)
- Cross-media classification (NIPS 2008)
Unsupervised Transfer Learning
- Self-taught clustering (ICML 2008)
Our Work (cont.)
- Wenyuan Dai, Yuqiang Chen, Gui-Rong Xue, Qiang Yang, and Yong Yu. Translated Learning. In Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems (NIPS 2008), Vancouver, British Columbia, Canada, December 8, 2008.
- Xiao Ling, Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. Cross-Domain Spectral Learning. In Proceedings of the Fourteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2008), Las Vegas, Nevada, USA, August 24-27, 2008, pp. 488-496.
- Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. Self-taught Clustering. In Proceedings of the 25th International Conference on Machine Learning (ICML 2008), Helsinki, Finland, July 5-9, 2008, pp. 200-207.
- Wenyuan Dai, Qiang Yang, Gui-Rong Xue, and Yong Yu. Boosting for Transfer Learning. In Proceedings of the 24th International Conference on Machine Learning (ICML 2007), Corvallis, Oregon, USA, June 20-24, 2007, pp. 193-200.
- Wenyuan Dai, Gui-Rong Xue, Qiang Yang, and Yong Yu. Co-clustering based Classification for Out-of-domain Documents. In Proceedings of the Thirteenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2007), San Jose, California, USA, August 12-15, 2007, pp. 210-219.
- Dou Shen, Jian-Tao Sun, Qiang Yang, and Zheng Chen. Building Bridges for Web Query Classification. In Proceedings of the 29th ACM International Conference on Research and Development in Information Retrieval (SIGIR 2006), Seattle, USA, August 6-11, 2006, pp. 131-138.
Query Classification and Online Advertisement
- ACM KDDCUP 2005 Winner
- SIGIR 2006
- ACM Transactions on Information Systems Journal 2006
- Joint work with Dou Shen, Jian-Tao Sun, and Zheng Chen
QC as Machine Learning
- Inspired by the KDDCUP'05 competition
- Classify a query into a ranked list of categories
- Queries are collected from real search engines
- Target categories are organized in a tree, with each node being a category
Related Work
Document/Query Expansion
- Borrow text from extra data sources:
  - Using hyperlinks [Glover 2002]
  - Using implicit links from query logs [Shen 2006]
  - Using existing taxonomies [Gabrilovich 2005]
  - Query expansion [Manning 2007]
- Global methods: independent of the queries
- Local methods: use relevance feedback or pseudo-relevance feedback
Query Classification/Clustering
- Classify Web queries by geographical locality [Gravano 2003]
- Classify queries according to their functional types [Kang 2003]
- Beitzel et al. studied topical classification as we do, but with manually classified data [Beitzel 2005]
- Beeferman and Wen each worked on query clustering using clickthrough data [Beeferman 2000; Wen 2001]
Target-Transfer Learning in QC
- A classifier, once trained, stays constant
- Target classes before: Sports, Politics (European, US, China)
- Target classes now: Sports (Olympics, Football, NBA), Stock Market (Asian, Dow, Nasdaq), History (Chinese, World)
- How can we allow the target to change?
- Application: advertisements come and go, but our query-to-target mapping need not be retrained! We call this the target-transfer learning problem.
Solutions: Query Enrichment + Staged Classification
[Architecture diagram ("The Architecture of Our Approach"): Phase I, the training phase, sends each query to a search engine and uses the labels and the text of the returned pages to construct a synonym-based classifier and a statistical classifier. Phase II, the testing phase, combines their classified results through the bridging classifier to map queries to the target categories and produce the final results.]
Step 1: Query Enrichment
[Diagram: each query is enriched with textual information (title, snippet, full text) and category information from the pages returned by a search engine.]
Step 2: Bridging Classifier
- Wish to avoid: when the target changes, training has to be repeated!
- Solution: connect the target taxonomy and queries by using an intermediate taxonomy as a bridge
Bridging Classifier (cont.)
How to connect? Score each target category C_i^T for a query q through the intermediate categories C_j^I:

p(C_i^T | q) ∝ Σ_j p(C_i^T | C_j^I) · p(q | C_j^I) · p(C_j^I)

- p(C_j^I): the prior probability of C_j^I
- p(C_i^T | C_j^I): the relation between C_i^T and C_j^I
- p(q | C_j^I): the relation between q and C_j^I
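The bridging classifier's scoring rule can be sketched in a few lines. All taxonomies, category names, and probability values below are made up for illustration; only the scoring formula follows the slide.

```python
# Score each target category C_T for a query q by summing over the
# intermediate categories C_I:
#   score(C_T, q) = sum_j p(C_T | C_I_j) * p(q | C_I_j) * p(C_I_j)

# Hypothetical intermediate taxonomy with two categories.
p_inter = {"Computers": 0.6, "Sports": 0.4}              # p(C_I_j)
p_target_given_inter = {                                 # p(C_T | C_I_j)
    "Hardware": {"Computers": 0.8, "Sports": 0.1},
    "Football": {"Computers": 0.2, "Sports": 0.9},
}
p_query_given_inter = {"Computers": 0.7, "Sports": 0.2}  # p(q | C_I_j) for one query

def bridge_score(target, p_tgt_given_int, p_q_given_int, p_int):
    # Sum the product of the three estimated distributions over all
    # intermediate categories.
    return sum(
        p_tgt_given_int[target][cj] * p_q_given_int[cj] * p_int[cj]
        for cj in p_int
    )

scores = {t: bridge_score(t, p_target_given_inter, p_query_given_inter, p_inter)
          for t in p_target_given_inter}
best = max(scores, key=scores.get)   # highest-scoring target category
```

Because only the p(C_i^T | C_j^I) table depends on the target taxonomy, changing the targets means re-estimating that one table, not retraining against the queries.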
Category Selection for Intermediate Taxonomy
Category selection for reducing complexity:
- Total Probability (TP)
- Mutual Information (MI)
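As an illustration of score-and-prune category selection, here is a minimal sketch. Assumption: intermediate categories are ranked by a total-probability-style score Σ_i p(C_j^I | C_i^T) and the top k are kept; the exact TP and MI criteria are defined in the paper, and the table below is invented.

```python
# Hypothetical p(C_I_j | C_T_i) table: rows are target categories,
# columns are intermediate categories.
p_inter_given_target = {
    "Hardware": {"Computers": 0.7, "Sports": 0.1, "Arts": 0.2},
    "Football": {"Computers": 0.1, "Sports": 0.8, "Arts": 0.1},
}

def select_top_k(p_int_given_tgt, k):
    # Score each intermediate category by summing its column over all
    # target categories, then keep the k highest-scoring ones.
    inter_cats = next(iter(p_int_given_tgt.values())).keys()
    tp = {cj: sum(row[cj] for row in p_int_given_tgt.values())
          for cj in inter_cats}
    return sorted(tp, key=tp.get, reverse=True)[:k]

selected = select_top_k(p_inter_given_target, k=2)
```

Pruning low-scoring intermediate categories shrinks the sum inside the bridging classifier, which is the complexity reduction the slide refers to.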
Experiment: Data Sets & Evaluation
ACM KDDCUP
- Started in 1997, the ACM KDDCup is the leading data mining and knowledge discovery competition in the world, organized by ACM SIGKDD.
ACM KDDCUP 2005
- Task: categorize 800K search queries into 67 categories
- Three awards: (1) Performance Award; (2) Precision Award; (3) Creativity Award
- Participation: 142 registered groups; 37 solutions submitted from 32 teams
- Evaluation data: 800 queries randomly selected from the 800K query set; 3 human labelers labeled the entire evaluation query set
- Evaluation measurements: Precision and Performance (F1). We won all three awards.

Overall F1 = (1/3) Σ_{i=1}^{3} F1 (against human labeler i)
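The KDDCUP'05 Overall F1 is simply the mean of the F1 scores measured against each of the three human labelers. A one-line sketch, with made-up F1 values:

```python
# F1 of one submitted solution against human labelers 1-3 (hypothetical values).
f1_per_labeler = [0.42, 0.45, 0.39]

# Overall F1 = (1/3) * sum of the three per-labeler F1 scores.
overall_f1 = sum(f1_per_labeler) / len(f1_per_labeler)
```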
Result of Bridging Classifiers
- Using the bridging classifier allows the target classes to change freely: no need to retrain the classifier!
[Figure: performance of the bridging classifier with different granularities of the intermediate taxonomy]
Summary: Target-Transfer Learning
[Diagram: a query is matched by similarity to an intermediate class, which is then classified to the target class.]
Cross-Domain Learning
Case 1
Source:
- Many labeled instances
Target:
- Few labeled instances
Target and source domains:
- Same feature representation
- Same classes Y (binary classes)
- Different P(X, Y) distributions
TrAdaBoost = Transfer AdaBoost
Given:
- Insufficient labeled data from the target domain (primary data)
- Labeled data following a different distribution (auxiliary data)
- The auxiliary data are weaker evidence for building the classifier
- Training set: source + target data, initialized with uniform weights
TrAdaBoost = Transfer AdaBoost (cont.)
For misclassified examples:
- increase the weights of the misclassified target data
- decrease the weights of the misclassified source data
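The asymmetric weight update above can be sketched as follows. This is a minimal per-iteration update following the TrAdaBoost scheme (Dai et al., ICML'07): misclassified target examples are up-weighted as in standard AdaBoost, while misclassified source examples are down-weighted by a fixed factor, so auxiliary data that conflict with the target distribution gradually fade out. The numeric inputs are made up.

```python
import math

def tradaboost_update(w_src, w_tgt, miss_src, miss_tgt, eps_t, n_src, n_iters):
    """One TrAdaBoost weight update.
    w_src/w_tgt: current example weights; miss_*: True where misclassified;
    eps_t: weighted error on the target data at this iteration."""
    # Fixed down-weighting factor for source data (depends only on data size
    # and the number of boosting iterations).
    beta_src = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_src) / n_iters))
    # AdaBoost-style factor from the target-domain error.
    beta_t = eps_t / (1.0 - eps_t)
    # Misclassified source examples shrink; misclassified target examples grow.
    new_src = [w * (beta_src if m else 1.0) for w, m in zip(w_src, miss_src)]
    new_tgt = [w * ((1.0 / beta_t) if m else 1.0) for w, m in zip(w_tgt, miss_tgt)]
    return new_src, new_tgt

w_src, w_tgt = tradaboost_update(
    [0.25, 0.25], [0.25, 0.25],
    miss_src=[True, False], miss_tgt=[True, False],
    eps_t=0.2, n_src=2, n_iters=10)
```

Correctly classified examples keep their weights; only the misclassified ones move, in opposite directions for source and target.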
TrAdaBoost = Transfer AdaBoost (cont.)
[Figure: performance results]
Transfer Learning in Sensor Network Tracking
- Received-Signal-Strength (RSS) based localization in an indoor WiFi environment.
- Where is the mobile device?
[Diagram: a mobile device at (location_x, location_y) receives signals from Access Points 1-3 at strengths such as -30dBm, -40dBm, and -70dBm.]
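RSS-based localization can be sketched as fingerprinting: an offline phase stores (RSS vector → location) pairs, and the online phase maps a fresh reading to the nearest stored fingerprint. This nearest-neighbor variant is one common baseline, not the slide's specific model, and all dBm values and grid coordinates are invented.

```python
# Offline calibration: RSS readings from (AP1, AP2, AP3) in dBm,
# each paired with the (x, y) grid location where it was measured.
fingerprints = [
    ((-30.0, -70.0, -40.0), (1.0, 2.0)),
    ((-65.0, -35.0, -50.0), (4.0, 2.0)),
    ((-55.0, -60.0, -30.0), (2.0, 5.0)),
]

def localize(rss):
    # Online phase: return the location of the closest stored fingerprint
    # in squared-Euclidean signal space.
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(fingerprints, key=lambda fp: dist2(fp[0], rss))[1]

loc = localize((-32.0, -68.0, -41.0))  # a reading near the first fingerprint
```

The mapping function f mentioned on the next slide is exactly this reading-to-location function, whatever model implements it.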
Distribution Changes
- The mapping function f learned in the offline phase can become out of date.
- Recollecting the WiFi data is very expensive.
- How can we adapt the model?
[Timeline: night time period up to t0, day time period from t0 to t1.]
Transfer Learning in Wireless Sensor Networks
- Transfer across time
- Transfer across space
- Transfer across devices
Latent Space based Transfer Learning (Spatial Transfer)
Transfer localization models across space [Pan, Yang et al. AAAI 08]
- Some labeled data collected in Area A, and unlabeled data in Area B
- Only a few labeled data collected in Area B
- Goal: construct a localization model for the whole area (Area A and Area B)
Transfer across time
- Area: 30 x 40 (81 grids)
- Six time periods: 12:30am-01:30am, 08:30am-09:30am, 12:30pm-01:30pm, 04:30pm-05:30pm, 08:30pm-09:30pm, 10:30pm-11:30pm
- LeMan: static mapping function learned from offline data
- LeMan2: relearn the mapping function from a few online data
- LeMan3: combine offline and online data into a single training set to learn the mapping function
Transfer knowledge via latent manifold learning
[Diagram: labeled WiFi data from different periods are connected through a shared latent manifold, along which knowledge is propagated.]
VIP Recommendation in Tencent Weibo
- Knowledge transfer from friendship relations in Tencent QQ, the largest instant messaging network
Properties:
1. Data sparsity: limited neighbors for most users
2. Heterogeneous links: symmetric friendship vs. asymmetric following
3. Large data: 1 billion users and tens of billions of links
VIP recommendation based on:
1. X: friendship on QQ
2. S1: user following relations on Tencent Weibo
3. S2: VIP following relations on Tencent Weibo
Method: Social Relation based Transfer (SORT)
Social App Recommendation in Tencent Qzone
- Qzone (http://qzone.qq.com) is the largest social network in China.
Other Applications
Video Recommendation in Tencent Video
- Rating prediction with four types of auxiliary data:
  1. binary ratings
  2. social networks
  3. context
  4. video content
Activity Recognition
- With sensor data collected on mobile devices
- Location: GPS, WiFi, RFID
- Context: location, weather, etc., from GPS, RFID, Bluetooth, etc.
- Various models can be used:
  - Non-sequential models: Naïve Bayes, SVM, ...
  - Sequential models: HMM, CRF, ...
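As a toy example of the non-sequential route, here is a Naïve Bayes activity recognizer over discrete sensor events with Laplace smoothing. The sensor names and activity labels are invented; only the model family is taken from the slide.

```python
import math
from collections import Counter, defaultdict

# Training data: lists of triggered sensors paired with activity labels
# (all hypothetical).
train = [
    (["tap", "cup", "kettle"], "MakeTea"),
    (["kettle", "cup"], "MakeTea"),
    (["toothbrush", "tap"], "BrushTeeth"),
    (["tap", "toothbrush"], "BrushTeeth"),
]

# Count labels and per-label sensor occurrences.
label_counts = Counter(lbl for _, lbl in train)
feat_counts = defaultdict(Counter)
vocab = set()
for feats, lbl in train:
    for f in feats:
        feat_counts[lbl][f] += 1
        vocab.add(f)

def predict(feats):
    # Pick the label maximizing log p(label) + sum_f log p(f | label),
    # with add-one (Laplace) smoothing over the sensor vocabulary.
    def logp(lbl):
        total = sum(feat_counts[lbl].values())
        lp = math.log(label_counts[lbl] / len(train))
        for f in feats:
            lp += math.log((feat_counts[lbl][f] + 1) / (total + len(vocab)))
        return lp
    return max(label_counts, key=logp)

pred = predict(["kettle", "cup", "tap"])
```

A sequential model (HMM, CRF) would additionally condition on the previous activity; this sketch treats each reading window independently.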
Activity Recognition: Input & Output (Vincent Zheng, A*STAR, Singapore)
Input:
- Context and locations: time, history, current/previous locations, duration, speed
- Object usage information
- Trained AR model: training data from calibration (calibration tool: VTrack)
Output: predicted activity labels
- Running? Walking? Tooth brushing? Having lunch?
http://www.cse.ust.hk/~vincentz/Vtrack.html
Datasets: MIT PlaceLab
- http://architecture.mit.edu/house_n/placelab.html
- MIT PlaceLab Dataset (PLIA2) [Intille et al. Pervasive 2005]
- Activities: common household activities
Cross-Domain Activity Recognition [Zheng, Hu, Yang, Ubicomp 2009]
- Challenge: a new domain of activities without labeled data
- Cross-domain activity recognition: transfer available labeled data from source activities to help train the recognizer for the target activities.
[Example activity domains: indoor cleaning, laundry, dishwashing]
How can we use the similarities between them?
[Pipeline diagram:
- Source-domain labeled data: <sensor reading, activity name> pairs, e.g. <SS, "Make Coffee">
- Similarity measure mined from the Web, e.g. sim("Make Coffee", "Make Tea") = 0.6
- Pseudo training data, e.g. <SS, "Make Tea", 0.6>
- Target-domain pseudo-labeled data
- Weighted SVM classifier]
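The pseudo-labeling step of this pipeline can be sketched directly: each source example is mapped onto similar target activities, carrying the mined similarity as a confidence weight for a downstream weighted classifier (a weighted SVM in the paper). The similarity values, sensor-reading IDs, and threshold below are invented.

```python
# Mined similarities sim(source activity, target activity), e.g. from the Web.
similarity = {
    ("Make Coffee", "Make Tea"): 0.6,
    ("Make Coffee", "Do Laundry"): 0.1,
}

# Source-domain labeled data: (sensor reading, activity name).
source_data = [("SS_1", "Make Coffee"), ("SS_2", "Make Coffee")]
target_activities = ["Make Tea", "Do Laundry"]

def pseudo_label(source, targets, sim, threshold=0.5):
    # Map each source example onto every sufficiently similar target
    # activity, keeping the similarity as the example's training weight.
    pseudo = []
    for reading, src_act in source:
        for tgt_act in targets:
            w = sim.get((src_act, tgt_act), 0.0)
            if w >= threshold:
                pseudo.append((reading, tgt_act, w))  # weighted triple
    return pseudo

pseudo = pseudo_label(source_data, target_activities, similarity)
```

The resulting triples are exactly the <SS, "Make Tea", 0.6>-style pseudo training data shown in the diagram.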
Calculating Activity Similarities
How similar are two activities?
- Use Web search results
- TF-IDF: traditional IR similarity metric (cosine similarity)
- Example: mined similarity between the activity "sweeping" and "vacuuming", "making the bed", "gardening"
[Figure: calculated similarity with the activity "Sweeping"]
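The TF-IDF cosine similarity used here can be sketched end to end: represent each activity by the text of its Web search results, build TF-IDF vectors, and take the cosine between them. The "search result" snippets below are stand-ins, not real mined text.

```python
import math
from collections import Counter

# Stand-in "search result" text per activity.
docs = {
    "sweeping": "broom floor dust sweep clean floor",
    "vacuuming": "vacuum floor dust carpet clean",
    "gardening": "plants soil water flowers garden",
}

def tfidf_vectors(docs):
    # Term frequency per document, document frequency across the collection,
    # then weight each term by tf * log(N / df).
    n = len(docs)
    df = Counter()
    tfs = {}
    for name, text in docs.items():
        tf = Counter(text.split())
        tfs[name] = tf
        for term in tf:
            df[term] += 1
    return {name: {t: c * math.log(n / df[t]) for t, c in tf.items()}
            for name, tf in tfs.items()}

def cosine(u, v):
    # Cosine similarity between two sparse term-weight vectors.
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = tfidf_vectors(docs)
sim_vac = cosine(vecs["sweeping"], vecs["vacuuming"])  # shares floor/dust/clean
sim_gar = cosine(vecs["sweeping"], vecs["gardening"])  # shares nothing
```

Activities whose search results share vocabulary ("sweeping" and "vacuuming") score high; unrelated ones ("gardening") score near zero, matching the bar chart the slide refers to.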